Preface


This report summarizes some preliminary results obtained in the course 
of examining the behavior of genetic operators as used in function 
optimization. The companion to this report [Bosworth, Foo and Zeigler, 
"Comparison of Genetic Algorithms with Conjugate Gradient Methods", University 
of Michigan Technical Report No. 00312-1-T, 1972] presents the actual 
implementation of such operators. Here we present some theoretical properties 
of two operators, namely crossover and inversion. 

This investigation has raised more questions than it has answered. 

Time has not permitted us to pursue them further. 

Numerous discussions with Bernard P. Zeigler have crystallized many 
concepts which would otherwise have remained hopelessly opaque. We have 
also followed his suggestions in many places as to the mode of presentation. 
John H. Holland originally conceived of the idea of genetic operators in 
a more general setting and we thank him for this inspiration. Roger Weinberg 
implemented a computer program for genetic operators in 1970 [Weinberg,
"Computer Simulation of a Living Cell: Interdisciplinary Synergism", University
of Michigan Technical Report No. 01252-3-T, 1970] which suggested that
our proposed enterprise was at least feasible.


Section 0 


Basic Concepts 

We begin by setting forth the basic concepts of crossover and inversion
as an intuitive basis for the mathematical development to come. We urge
the reader to consult our paper (Bosworth et al., 1972) for illustration
in the context of an actual optimization system.

Both crossover and inversion are operators on "strings". A "string"
is an ordered n-tuple with an associated permutation of $\{1,\ldots,n\}$.

Crossover acts on two strings to yield two new strings which are the two
original strings with some corresponding coordinates exchanged. E.g.,
crossover on $(a_1,a_2,a_3)$ and $(b_1,b_2,b_3)$ might yield $(a_1,b_2,a_3)$ and $(b_1,a_2,b_3)$.
The associated permutation is the rule for correspondence of coordinates.
Inversion acts on a string by reordering the ordered n-tuple and changing
the associated permutation in the same way. E.g., inversion might act on
$(a_1,a_2,a_3,a_4)$ with $(2,1,4,3)$ to yield $(a_3,a_2,a_1,a_4)$ with $(4,1,2,3)$.

This interpretation of crossover and inversion is motivated by natural 
and artificial genetics. In natural genetics a string corresponds to a 
chromosome. The order of alleles in a chromosome is arbitrary but no matter 
where an allele appears in the chromosome the character it expresses is 
unambiguous. In artificial genetics, chromosomes must be represented
by ordered n-tuples of numbers. Here the character expressed by a number
(allele) means the part which the particular number takes in the evaluation
of the string. A function is used to evaluate strings, so in general a
correspondence must be set up between the artificial "chromosomes" and
points in the domain of the function. The associated permutation is the
needed correspondence. We will call this permutation the "inversion
pattern" of the string and denote it by an n-tuple $(i_1,\ldots,i_n)$. We will
call the domain of the evaluation function the function space, say S.

The correspondence of coordinates in the function space is as follows:

If j is the i-th coordinate of the inversion pattern, then the i-th coordinate
of the string is the j-th coordinate of the point in the function space.

E.g., $(a_1,a_2,a_3,a_4,a_5)$ with $(2,5,3,4,1)$ corresponds with $(a_5,a_1,a_3,a_4,a_2) \in S$.
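The correspondence rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `to_point` and `invert` are introduced here, and inversion is modelled as reversing a contiguous segment of the string together with the same segment of its inversion pattern, which is consistent with the inversion example given earlier but is not spelled out in the report.

```python
def to_point(string, pattern):
    """Map a string with its inversion pattern to a point of the function space.

    pattern[i] = j (1-based) means: the i-th coordinate of the string is the
    j-th coordinate of the point.
    """
    point = [None] * len(string)
    for value, j in zip(string, pattern):
        point[j - 1] = value
    return tuple(point)


def invert(string, pattern, lo, hi):
    """Reverse positions lo..hi (0-based, inclusive) of the string and of its
    inversion pattern in the same way (assumed form of inversion)."""
    def rev(t):
        return t[:lo] + tuple(reversed(t[lo:hi + 1])) + t[hi + 1:]
    return rev(tuple(string)), rev(tuple(pattern))


# The five-coordinate example from the text.
s5 = ("a1", "a2", "a3", "a4", "a5")
print(to_point(s5, (2, 5, 3, 4, 1)))

# Inversion changes the string and the pattern, but not the point they denote.
s4, p4 = ("a1", "a2", "a3", "a4"), (2, 1, 4, 3)
s4_inv, p4_inv = invert(s4, p4, 0, 2)
print(s4_inv, p4_inv, to_point(s4, p4) == to_point(s4_inv, p4_inv))
```

The last line illustrates why the inversion pattern is carried along: the evaluation of a string depends only on the point it corresponds to, so inversion leaves the function value unchanged.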

These interpretations lead to the consideration of strings with associated
inversion patterns as an extension of the function space. If S is the function
space, a set of n-tuples, and T is the set of permutations of $\{1,\ldots,n\}$,
then a string s with associated inversion pattern r may be considered
as a pair $(s,r) \in S \times T$. A point in $S \times T$ is evaluated by applying the
function to the corresponding point in S.

Normally, for computational reasons, crossover is applied to $(s_1,r_1)$
and $(s_2,r_2)$ only when $r_1 = r_2$. In Section 6 we will see that for some
functions this may be relaxed with no added work.

We summarize the development as follows: 

Section 1 presents an algebraic picture of crossover, which is 
complemented by the geometric interpretation in Section 2. In Sections 3 
and 4, two different but related approaches to the stochastic properties 
of heuristics are discussed. Sections 5 and 6 examine some algebraic and 
geometric aspects of inversion. Finally, Section 7 rounds off the discussion 
with a brief and tentative look at one approach to the evaluation of genetic 
strategies.

SECTION 1


The Algebraic Structure of Crossover 


Let S be a set, $n \in N$ and $0 < i \le n$.

def: $c_i : \{S^n, S^n\} \to \{S^n, S^n\}$ such that

$$c_i(\{(a_1,\ldots,a_n),(b_1,\ldots,b_n)\}) = \{(a_1,\ldots,a_{i-1},b_i,a_{i+1},\ldots,a_n),\ (b_1,\ldots,b_{i-1},a_i,b_{i+1},\ldots,b_n)\}.$$

def: If $0 < i < j \le n$ then $c_{ij} = \prod_{k=i}^{j} c_k$.

def: $C = \{k \mid k = \prod_{j=1}^{r} c_{i_j}$ where $r \in N$ and for all $j \le r$, $i_j \in N$ with $0 < i_j \le n\}$ = set of
all crossover operators.

def: $e = c_1 c_1$.


Lemma 1.1


If $0 < i \le n$ then $c_i c_i = e$. This follows directly from the definition
of $c_i$.

def: $K = \langle C, \text{function composition} \rangle$.

Notation: function composition will be treated like multiplication since
it is associative.


Lemma 1.2


e is the identity of K.


Proof: $c_i e(\{(a_1,\ldots,a_n),(b_1,\ldots,b_n)\}) = c_i c_1(\{(b_1,a_2,\ldots,a_n),(a_1,b_2,\ldots,b_n)\})
= c_i(\{(a_1,\ldots,a_n),(b_1,\ldots,b_n)\})$, therefore $c_i e = c_i$. Likewise
$c_i e = c_i(c_i c_i) = (c_i c_i) c_i = e c_i = c_i$. $k \in C \Rightarrow k = c_i k' c_j$ for some
$c_i, k', c_j \in C$; therefore $ek = e c_i k' c_j = c_i k' c_j = c_i k' c_j e = ke = k$; therefore
e is the identity of K.
Let $GF(2) = \langle \{0,1\}, +, \cdot \rangle$, V = the n-dimensional vector space over GF(2).

$\alpha \in V \Rightarrow \alpha = (\alpha_1,\ldots,\alpha_n)$ where $\alpha_i \in \{0,1\}$.

Definition

$g : V \to P(\{1,\ldots,n\})$ by $g(\alpha) = \{i \mid \alpha_i = 1\}$; this is obviously a one-to-one
onto map.

Definition

$f : V \to K$ by

$$f(\alpha) = \begin{cases} \prod_{i \in g(\alpha)} c_i & \text{if } \alpha \neq 0 \\ e & \text{if } \alpha = 0 \end{cases}$$

Lemma 1.3

f is well-defined.

Proof:

$\alpha, \beta \in V$ and $\alpha = \beta \Rightarrow g(\alpha) = g(\beta)$, so if $c_i c_j = c_j c_i$ for all $1 \le i, j \le n$, then
$\prod_{i \in g(\alpha)} c_i = \prod_{i \in g(\beta)} c_i$.

If $i = j$: $c_i c_j = c_i c_i = c_j c_j = c_j c_i$.

If $i \neq j$:

$$c_i c_j(\{(a_1,\ldots,a_n),(b_1,\ldots,b_n)\}) = c_i(\{(a_1,\ldots,b_j,\ldots,a_n),(b_1,\ldots,a_j,\ldots,b_n)\})$$
$$= \{(a_1,\ldots,b_i,\ldots,b_j,\ldots,a_n),(b_1,\ldots,a_i,\ldots,a_j,\ldots,b_n)\} = c_j c_i(\{(a_1,\ldots,a_n),(b_1,\ldots,b_n)\}).$$

Therefore $c_i c_j = c_j c_i$.

Therefore $f(\alpha) = f(\beta)$; therefore f is well-defined.
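The two facts used so far, $c_i c_i = e$ and $c_i c_j = c_j c_i$, can be checked mechanically. The following is a minimal sketch, assuming an unordered pair of points is represented as a sorted tuple of two n-tuples; that representation and the function name `c` are choices made here for illustration, not taken from the report.

```python
def c(i, pair):
    """Apply the operator c_i (1-based index i) to an unordered pair of n-tuples.

    The pair is returned in sorted order so that {p, q} and {q, p}
    have the same representation.
    """
    a, b = pair
    a_new = a[:i - 1] + (b[i - 1],) + a[i:]
    b_new = b[:i - 1] + (a[i - 1],) + b[i:]
    return tuple(sorted((a_new, b_new)))


pair = ((1, 2, 3), (4, 5, 6))
n = 3

# Lemma 1.1: c_i c_i = e.
involution = all(c(i, c(i, pair)) == pair for i in range(1, n + 1))

# Lemma 1.3: c_i c_j = c_j c_i.
commute = all(c(i, c(j, pair)) == c(j, c(i, pair))
              for i in range(1, n + 1) for j in range(1, n + 1))

# Exchanging every coordinate swaps the two points, which is the
# identity on unordered pairs.
swapped_all = pair
for i in range(1, n + 1):
    swapped_all = c(i, swapped_all)

print(involution, commute, swapped_all == pair)
```

The final check anticipates the observation made later in this section that the product of all the $c_i$ is the identity.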


Theorem 1.1

f is a homomorphism.
Proof:

Let $\alpha, \beta \in V$ and $\alpha + \beta \neq 0$.

$$f(\alpha+\beta) = \prod_{i \in g(\alpha+\beta)} c_i .$$


$c_{i_1} c_{i_1} c_{i_2} \cdots c_{i_r} c_{i_2} \cdots c_{i_r} = c_{i_2} \cdots c_{i_r} c_{i_2} \cdots c_{i_r} = e$; therefore, by
induction, $k \in K \Rightarrow kk = e$; therefore K is a 2-group.

Q.E.D.


def: $R = \{\{\alpha,\beta\} \mid \alpha = \{1,\ldots,n\} - \beta\}$.


Theorem 1.2

There is a one-to-one correspondence between K and R.

Proof: $a \in R \Rightarrow a = \{\alpha,\beta\}$ where $\alpha = \{1,\ldots,n\} - \beta$. Let $f : R \to K$ be
defined by

$$f(a) = \begin{cases} \prod_{i \in \alpha} c_i & \text{if } \alpha \neq \emptyset \\ e & \text{if } \alpha = \emptyset \end{cases}$$

For f to be well-defined we need $\prod_{i \in \alpha} c_i = \prod_{i \in \beta} c_i$. Let $\alpha = \{1,\ldots,n\} - \beta$. If $\alpha = \emptyset$, then

$$\prod_{i \in \beta} c_i(\{(a_1,\ldots,a_n),(b_1,\ldots,b_n)\}) = \{(b_1,\ldots,b_n),(a_1,\ldots,a_n)\} = e(\{(a_1,\ldots,a_n),(b_1,\ldots,b_n)\}).$$

Let $\{i_1,\ldots,i_r\} = \alpha$, $\{j_1,\ldots,j_s\} = \beta$; then

$$\prod_{i \in \alpha} c_i \prod_{j \in \beta} c_j = \prod_{i \in \{1,\ldots,n\}} c_i = e.$$

Since inverses are unique, $\prod_{i \in \alpha} c_i = \prod_{i \in \beta} c_i$; therefore $a = b \Rightarrow f(a) = f(b)$;
therefore f is well-defined. Let $k \in K$. Then $k = c_{i_1} \cdots c_{i_r}$ for some
$r \in N$, $0 < i_1,\ldots,i_r \le n$. Let $\alpha = \{j \mid j = i_\ell$ and there are an odd number
of $i_\ell$ such that $i_\ell = j\}$; then $k = f(a)$ where $\alpha \in a \in R$. Therefore f
is an onto function. $|P(\{1,\ldots,n\})| = 2^n$, therefore $|R| = 2^{n-1}$. $|K|$ = the
number of different crossover operators = number of different pairs of points
which may result from crossover on a pair of points.

There are $2^n$ ordered pairs of such points, so that without order this
is $2^{n-1}$. Since $|K| = |R| < \infty$ and f is onto, f is one-to-one.

Q.E.D.


Remark: R may now be used as a meaningful index set for K.

Notation: $k_\alpha \in K$ means $k_\alpha = f(a)$ where $\alpha \in a \in R$.
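The count $|K| = 2^{n-1}$ can be confirmed by brute force for small n. A minimal sketch, assuming a crossover operator is specified by a 0/1 mask over the coordinates and is applied to one generic pair whose coordinates are all distinct (so that two operators agree on this pair exactly when they are the same operator); the names `crossover` and `count_operators` are introduced here.

```python
from itertools import product


def crossover(mask, pair):
    """Exchange the coordinates where mask is 1; return the unordered result
    as a sorted tuple of two points."""
    a, b = pair
    y1 = tuple(b[i] if m else a[i] for i, m in enumerate(mask))
    y2 = tuple(a[i] if m else b[i] for i, m in enumerate(mask))
    return tuple(sorted((y1, y2)))


def count_operators(n):
    """Number of distinct unordered pairs obtainable from one generic pair."""
    pair = (tuple(range(n)), tuple(range(n, 2 * n)))
    results = {crossover(mask, pair) for mask in product((0, 1), repeat=n)}
    return len(results)


for n in range(1, 7):
    print(n, count_operators(n), 2 ** (n - 1))
```

Each mask and its complement give the same unordered pair, which is why the $2^n$ masks collapse to $2^{n-1}$ operators.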
If $i \in g(\alpha) \cap g(\beta)$ then $i \notin g(\alpha+\beta)$ and $c_i$ occurs in $f(\alpha)$ and in $f(\beta)$.

Thus $f(\alpha)f(\beta)$ has exactly two occurrences of $c_i$. $c_i$ commutes with all
$c_j$; therefore, $f(\alpha)f(\beta) = c_{j_1} \cdots c_{j_r} c_i c_i = c_{j_1} \cdots c_{j_r} e = c_{j_1} \cdots c_{j_r}$.

Therefore, $f(\alpha+\beta)$ has the same effect in the i-th place as $f(\alpha)f(\beta)$.

If $i \in (g(\alpha) - g(\beta)) \cup (g(\beta) - g(\alpha))$, $c_i$ occurs only once in $f(\alpha)f(\beta)$
and once in $f(\alpha+\beta)$.

Therefore, $f(\alpha+\beta) = f(\alpha)f(\beta) = f(\beta+\alpha) = f(\beta)f(\alpha)$. Q.E.D.

Lemma 1.4

f is onto and has kernel $\{(0,\ldots,0),(1,\ldots,1)\}$.

Proof:

By the definition of the $c_i$ operators, $e = \prod_{i \in \{1,\ldots,n\}} c_i$.

Therefore, $f((0,\ldots,0)) = f((1,\ldots,1)) = e$.

Therefore, f has kernel at least $\{(0,\ldots,0),(1,\ldots,1)\}$. f is onto because
$k \in K$ is e or may be written as $k = c_{i_1} c_{i_2} \cdots c_{i_r}$ where no $c_i$ occurs twice,
since $c_i^2 = e$ and $c_i c_j = c_j c_i$.

Suppose $\alpha \neq \mathbf{0}$ and $\alpha \neq \mathbf{1}$, $f(\alpha) = \prod_{i \in g(\alpha)} c_i$. $\alpha \neq \mathbf{0} \Rightarrow$ there is
an i such that $\alpha_i = 1$; $\alpha \neq \mathbf{1} \Rightarrow \exists j$ such that $\alpha_j = 0$; then $f(\alpha)$ acts on the i-th coordinate
but not on the j-th.

Therefore, $f(\alpha) \neq f(\mathbf{0})$.

Therefore, $\ker f = \{\mathbf{0}, \mathbf{1}\}$.

Therefore, $K \cong V/\ker(f)$.

Notice: $\ker(f)$ is isomorphic to the two-element group.

Corollary

K is a commutative group.

Proof:


V is a commutative group and f is a homomorphism onto K; therefore K is a
commutative group, since the homomorphic image of a commutative group is commutative.

Notation: $k_\alpha \in K$ means $f(g^{-1}(\alpha))$ if $\alpha \in P(\{1,\ldots,n\})$.

Corollary

K is a 2-group.

Proof:

$\alpha \in V \Rightarrow f(\alpha+\alpha) = f(0) = f(\alpha)f(\alpha) = e$.

The group structure on K does not seem to answer any questions which 
are being presently asked. However, these results show a very specific 
structure about which many things are known. Therefore in the future 
they may prove to be very useful. 

The notation developed in this section will be used throughout this 
paper. 
SECTION 2


Generalized crossover operators act on sets of points to yield new
sets of points. There are some interesting properties of the geometry
of these point sets which will now be investigated. In what follows the
set S as defined in Section 1 is identified with $\mathbb{R}$, the set of real numbers,
although this restriction may be relaxed later.


Notation:

$\|\cdot\|$ is the Euclidean norm in $\mathbb{R}^n$, and $\langle\cdot,\cdot\rangle$ is the inner product
in Euclidean space $\mathbb{R}^n$. That is,

$$\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i .$$

Notation: If $x^{(1)}, x^{(2)} \in \mathbb{R}^n$, then denote the pair $k_\alpha(\{x^{(1)},x^{(2)}\})$ by
$\{y^{(1)},y^{(2)}\} \in \mathbb{R}^n \times \mathbb{R}^n$ where $k_\alpha$ is a generalized crossover operator
as previously defined in Section 1.

Remark: As should be clear from earlier discussions, the pairs above
are not necessarily ordered unless some convention is adopted which associates
$x^{(i)}$ with $y^{(i)}$. There is no a priori reason why any one convention is
"best" in an obvious way.

Notation: As in Section 1, if $\alpha \in P(\{1,2,3,\ldots,n\})$ then $\alpha = \{i_1,i_2,i_3,\ldots,i_m\}$
where $\{i_j\}$ are the indices of coordinates which get crossed over when
$k_\alpha$ is applied.

Let $\bar\alpha = \{1,2,\ldots,n\} - \alpha$.

Using this notation we may specify the result of a $k_\alpha$ operator as
follows:

If $k_\alpha\{x^{(1)},x^{(2)}\} = \{y^{(1)},y^{(2)}\}$ then

$$y^{(1)}_i = x^{(j)}_i \quad\text{where } j = \begin{cases} 1 & \text{if } i \in \bar\alpha \\ 2 & \text{if } i \in \alpha \end{cases}$$

$$y^{(2)}_i = x^{(j)}_i \quad\text{where } j = \begin{cases} 2 & \text{if } i \in \bar\alpha \\ 1 & \text{if } i \in \alpha \end{cases}$$

Remark: We have no reason for naming the product points $y^{(1)}$, $y^{(2)}$
in any unique way.

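Lemma 2.1

(a) $\|x^{(1)} - y^{(1)}\| = \|x^{(2)} - y^{(2)}\|$

(b) $\|x^{(1)} - y^{(2)}\| = \|x^{(2)} - y^{(1)}\|$

(c) $\|x^{(1)} - x^{(2)}\| = \|y^{(1)} - y^{(2)}\|$

Proof:

$$\|x^{(1)} - y^{(1)}\| = \Big[\sum_{i=1}^{n} (x^{(1)}_i - y^{(1)}_i)^2\Big]^{1/2} = \Big[\sum_{i \in \alpha} (x^{(1)}_i - x^{(2)}_i)^2\Big]^{1/2}$$

$$\|x^{(2)} - y^{(2)}\| = \Big[\sum_{i=1}^{n} (x^{(2)}_i - y^{(2)}_i)^2\Big]^{1/2} = \Big[\sum_{i \in \alpha} (x^{(2)}_i - x^{(1)}_i)^2\Big]^{1/2}$$

The proofs for (b) and (c) are similar.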
Lemma 2.2

For all $Z \in \mathbb{R}^n$,

$$\|x^{(1)} - Z\|^2 + \|x^{(2)} - Z\|^2 = \|y^{(1)} - Z\|^2 + \|y^{(2)} - Z\|^2 .$$

Proof:

$$\|x^{(1)} - Z\|^2 + \|x^{(2)} - Z\|^2 = \sum_{i=1}^{n} (x^{(1)}_i - z_i)^2 + \sum_{i=1}^{n} (x^{(2)}_i - z_i)^2$$

$$= \sum_{i \in \bar\alpha} (x^{(1)}_i - z_i)^2 + \sum_{i \in \alpha} (x^{(2)}_i - z_i)^2 + \sum_{i \in \bar\alpha} (x^{(2)}_i - z_i)^2 + \sum_{i \in \alpha} (x^{(1)}_i - z_i)^2$$

$$= \|y^{(1)} - Z\|^2 + \|y^{(2)} - Z\|^2 .$$

Corollary 2.2

(a) $\|x^{(1)} - x^{(2)}\|^2 = \|y^{(1)} - x^{(2)}\|^2 + \|y^{(1)} - x^{(1)}\|^2$

(b) $\|x^{(1)} - x^{(2)}\|^2 = \|y^{(2)} - x^{(2)}\|^2 + \|y^{(2)} - x^{(1)}\|^2$

Proof:

Let $Z = x^{(2)}$ in the lemma:

$$\|x^{(1)} - x^{(2)}\|^2 = \|y^{(1)} - x^{(2)}\|^2 + \|y^{(2)} - x^{(2)}\|^2 = \|y^{(1)} - x^{(2)}\|^2 + \|y^{(1)} - x^{(1)}\|^2$$

by Lemma 2.1(a).

The proof for (b) is similar.


Remark 1: This corollary is symmetric in x and y and we can quite happily
exchange their roles.

Remark 2: The result in Corollary 2.2 suggests, from an elementary theorem
in geometry, that possible loci for $y^{(1)}$ and $y^{(2)}$ are on the surface of
an n-sphere with $(x^{(1)} + x^{(2)})/2$ as center and diameter $\|x^{(1)} - x^{(2)}\|$.
This is in fact the case. In order to establish it a lemma is needed.
Lemma 2.3

$\langle (y^{(i)} - x^{(1)}), (y^{(i)} - x^{(2)}) \rangle = 0$, $i = 1, 2$.

Proof:

For any component $c_j$ of the inner product (taking $i = 1$),

$$c_j = (y^{(1)}_j - x^{(1)}_j)(y^{(1)}_j - x^{(2)}_j).$$

If $j \notin \alpha$, the first term is zero, and if $j \in \alpha$, the second term is zero.

Hence in any case $c_j = 0$, and the result follows.

The proof for $y^{(2)}$ is similar.


Theorem 2.1

If $k_\alpha\{x^{(1)},x^{(2)}\} = \{y^{(1)},y^{(2)}\}$ then $y^{(1)}$ and $y^{(2)}$ lie on the surface
of an n-sphere centered at $(x^{(1)} + x^{(2)})/2$, with radius $\|x^{(1)} - x^{(2)}\|/2$.

Moreover, $y^{(1)}$ and $y^{(2)}$ lie on the extremities of a diameter of this
n-sphere.


Proof:

For the first part it suffices to show that $y^{(1)}$ (or $y^{(2)}$, since
the proof is similar) satisfies the equation of an n-sphere as above,
i.e.,

$$\Big\| Z - \frac{x^{(1)} + x^{(2)}}{2} \Big\| = \frac{\|x^{(1)} - x^{(2)}\|}{2}$$

or

$$\|(Z - x^{(1)}) + (Z - x^{(2)})\|^2 = \|x^{(1)} - x^{(2)}\|^2 .$$

The L.H.S. expands to

$$\|Z - x^{(1)}\|^2 + \|Z - x^{(2)}\|^2 + 2\langle (Z - x^{(1)}), (Z - x^{(2)}) \rangle .$$

Let $Z = y^{(1)}$. Then by Lemma 2.3, the inner product term vanishes, and
the L.H.S. reduces to $\|y^{(1)} - x^{(1)}\|^2 + \|y^{(1)} - x^{(2)}\|^2$ which, by
Corollary 2.2(a), is equal to the R.H.S., thus proving the first part.

The second part is suggested by Lemma 2.1(c) and may be shown
directly by observing that

$$y^{(1)} + y^{(2)} = x^{(1)} + x^{(2)},$$

which is easily verified.

Remark: Theorem 2.1 has a very simple interpretation. Suppose we begin
with two points in $\mathbb{R}^n$. Then crossover constrains the two new points to
lie on the surface of a hypersphere with the mid-point of the original
points as center, and their distance apart as diameter. So, if
"daughter" points are subject to crossover their products are again constrained
to lie on the same hypersphere. The metric properties in Lemmas 2.1,
2.2, and Corollary 2.2 are obvious properties following from this theorem.
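Theorem 2.1 lends itself to a direct numerical check. The sketch below is illustrative only: the helper names are introduced here, coordinates are indexed from 0, and the tolerance is an arbitrary choice to absorb floating-point rounding.

```python
import math


def crossover(x1, x2, alpha):
    """Exchange the coordinates indexed by alpha (0-based) between x1 and x2."""
    y1 = tuple(x2[i] if i in alpha else x1[i] for i in range(len(x1)))
    y2 = tuple(x1[i] if i in alpha else x2[i] for i in range(len(x1)))
    return y1, y2


def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def check_theorem_2_1(x1, x2, alpha, tol=1e-9):
    """True if both products lie on the sphere with the parents as a diameter
    and are themselves the end points of a diameter."""
    y1, y2 = crossover(x1, x2, alpha)
    center = tuple((a + b) / 2 for a, b in zip(x1, x2))
    radius = dist(x1, x2) / 2
    on_sphere = all(abs(dist(y, center) - radius) <= tol for y in (y1, y2))
    diametral = abs(dist(y1, y2) - 2 * radius) <= tol
    return on_sphere and diametral


# Parents (0, 0) and (2, 2); exchanging the first coordinate gives
# (2, 0) and (0, 2), the other two corners of the same square.
print(crossover((0.0, 0.0), (2.0, 2.0), {0}))
print(check_theorem_2_1((0.0, 0.0), (2.0, 2.0), {0}))
```

In the plane the picture is the familiar one: the four points are the corners of an axis-aligned rectangle, and both diagonals are diameters of its circumscribed circle.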

For further development the notion of a minimal bounding sphere is
required. Intuitively, suppose a set of points $S \subset \mathbb{R}^n$ is given; we seek
a "smallest" n-sphere which can contain all of these points of S.

Clearly, at least one such covering n-sphere exists. So as a first
attempt at this formalization:


Definition 2.1: $S_i$ is an admissible bounding n-sphere for S if $S \subseteq S_i$.

Definition 2.2: Let $\{S_i\}$ be the set of all admissible bounding n-spheres
for S. Then let $r_i = \tfrac{1}{2}\,\mathrm{diam}(S_i)$. The minimal bounding n-sphere (M.B.S.)
of S is $S_m$ where $r_m = \inf\{r_i \mid r_i = \tfrac{1}{2}\,\mathrm{diam}(S_i)\}$.

Remark: The above definitions have to be "tightened up" later - for
instance, there is the question of characterization of a M.B.S. in terms
of the points which it bounds. This was not investigated. However, Zorn's
Lemma and the symmetry of n-spheres suggest that the M.B.S. is unique.
The question naturally arises as to how fast crossover enables an
initial point set $S \subset \mathbb{R}^n$ to "search" a space. This is, in a sense not
yet fully defined, equivalent to asking how quickly the M.B.S. for the
point set S can expand. To this end a theorem is proved:

Theorem 2.2 (Refer to Figure 2.2)

Let $S_0$ be the M.B.S. for $S \subset \mathbb{R}^n$, $r_0$ its radius, and $x_0$ its center.

Then the maximal M.B.S. $S_1$ for $k_\alpha(S)$ has $x_0$ as center and radius $\sqrt{2}\, r_0$.

Remark: Before proving the theorem a comment about "maximal" M.B.S. is
in order. Since $k_\alpha(S)$ is different, in general, for different $\alpha$, and it
is clear that $\mathrm{diam}(k_\alpha(S))$ has some upper bound, by the maximal M.B.S.
we mean the bounding sphere for the largest possible expansion rate over
one crossover generation; i.e., we are looking for a l.u.b. We always
assume that S is a bounded set.

Proof: 

Since the proof is entirely algebraic, its geometric motivation will 
be more transparent if occasional reference is made to Figure 2.2. None 
of the arguments below, however, rely on geometry as such. We can proceed 
as follows: 

Let r be a (radius) vector centered at $x_0$. Let $h = \gamma r$, $0 \le \gamma \le 1$.

For a unit vector u orthogonal to r, $\langle u, r \rangle = 0$. Let $x_0 + h = x_1$; then the
equation of a line passing through $x_1$ is

$$Z = x_0 + h + \lambda u, \quad \lambda \in \mathbb{R}.$$

The equation of $S_0$ being

$$\|Z - x_0\| = r_0 \quad (\text{where } \|r\| = r_0),$$

we have that the line intersects the surface of $S_0$ when

$$\|Z - x_0\| = r_0 = \|h + \lambda u\|,$$

i.e., $r_0^2 = \|\gamma r\|^2 + \|\lambda u\|^2 + 2\langle \lambda u, \gamma r \rangle = \gamma^2 r_0^2 + \lambda^2$, since $\langle u, r \rangle = 0$
and $\|u\|^2 = 1$. Therefore $\lambda = \pm r_0\sqrt{1-\gamma^2}$.

By Theorem 2.1, if $Z_a$ and $Z_b$ are the two points on $S_0$ for these values
of $\lambda$, generalized crossover will produce two points $Z_{a'}$ and $Z_{b'}$ which lie
on an n-sphere centered about $(Z_a + Z_b)/2$ with radius $\|Z_a - Z_b\|/2$.

It is easily verified that $\|Z_a - Z_b\|/2 = r_0\sqrt{1-\gamma^2}$ and
$(Z_a + Z_b)/2 = x_0 + h = x_1$.

The equation of the n-sphere about $x_1$ with radius $r_0\sqrt{1-\gamma^2}$ is

$$\|Z' - x_1\| = r_0\sqrt{1-\gamma^2}.$$

Consider the triangle inequality:

$$\|Z' - x_0\| \le \|Z' - x_1\| + \|x_1 - x_0\| = r_0\sqrt{1-\gamma^2} + r_0\gamma .$$

The bound on the R.H.S. attains a maximum at $\gamma = 1/\sqrt{2}$ by simple differentiation;
and with this value of $\gamma$, $\|Z' - x_0\| = \sqrt{2}\, r_0$, showing that the upper
bound on $\|Z' - x_0\|$ is in fact attainable.

Moreover, this is attained when $Z' - x_1$ is in the direction of
r (or h), since in this case $(Z' - x_1) + (x_1 - x_0) = \tfrac{\sqrt{2}}{2} r + \tfrac{\sqrt{2}}{2} r = \sqrt{2}\, r$.

To complete the proof, observe that a choice of $h' = -\gamma r$ leads to exactly
the same conclusion on the diametrically opposite end of the n-sphere $S_0$.
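The differentiation step in the proof can be written out explicitly. Writing $\varphi(\gamma)$ for the right-hand side of the triangle inequality (the name $\varphi$ is introduced here only for this worked step):

```latex
\varphi(\gamma) = r_0\left(\sqrt{1-\gamma^2} + \gamma\right), \qquad 0 \le \gamma \le 1,
\\[4pt]
\varphi'(\gamma) = r_0\left(1 - \frac{\gamma}{\sqrt{1-\gamma^2}}\right) = 0
\;\Longleftrightarrow\; \gamma = \sqrt{1-\gamma^2}
\;\Longleftrightarrow\; \gamma = \frac{1}{\sqrt{2}},
\\[4pt]
\varphi''(\gamma) = -\,\frac{r_0}{(1-\gamma^2)^{3/2}} < 0,
\qquad
\varphi\!\left(\tfrac{1}{\sqrt{2}}\right) = r_0\left(\tfrac{1}{\sqrt{2}} + \tfrac{1}{\sqrt{2}}\right) = \sqrt{2}\, r_0 .
```

Since $\varphi(0) = \varphi(1) = r_0 < \sqrt{2}\, r_0$, the interior critical point is the maximum on the whole interval.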

Corollary 2.2

Let n be any normal on the n-sphere $S_0$. Then maximal expansion in this
direction can only occur if the intersection of the hyperplane

$$\langle n, [Z - (x_0 + \tfrac{\sqrt{2}}{2} r_0 n)] \rangle = 0$$

and $S_0$, $\|Z - x_0\| = r_0$, has at least two points of S on the end points of
a diameter of the intersection (which is a hypercircle).

Proof: Immediate from the theorem.

Figure 2.1

General crossover constrains products to lie on a hypersphere.

Figure 2.2

The motivation for searching for a maximal
rate of expansion of the M.B.S. in one generation of crossover.


The next question is whether crossover, or more accurately, a sequence 
of crossovers, leads to a bounded or unbounded set of points. The following 
theorem answers this question: 

Theorem 2.3

Let $S_0$ be a M.B.S. for bounded $S \subset \mathbb{R}^n$, and $x_0$ its center with radius
$r_0$. Then if $\{k_\alpha\}_{\alpha \in \sigma}$ (where $\sigma$ is a countable index set) is a sequence of
generalized crossover operators, the maximal M.B.S. for
$(k_{\alpha_1} k_{\alpha_2} \cdots k_{\alpha_m} \cdots)(S)$ has $x_0$ as center with radius $\sqrt{n}\, r_0$.

However, first we prove a useful lemma:

Lemma 2.4

If $S_0$ is a M.B.S. for S and

$$x_{k_{\max}} = \max\{x^{(i)}_k \mid x^{(i)} \in S\}$$

$$x_{k_{\min}} = \min\{x^{(i)}_k \mid x^{(i)} \in S\}$$

then it is not possible that

(a) $x_{k_{\max}} < x_{0_k}$ for some k

(b) $x_{k_{\min}} > x_{0_k}$ for some k.

Proof:

It suffices to prove (b), since the proof for (a) is similar.

Assume the contrary. Then $\exists k$ such that

$$x_{k_{\min}} - x_{0_k} = \varepsilon_k > 0. \qquad (2.4.1)$$

Let $Z_0$ be the point obtained as follows:

$$z_{0_i} = \begin{cases} x_{0_i} & i \neq k \\ x_{k_{\min}} & i = k \end{cases}$$

Then

$$\|x^{(i)} - Z_0\|^2 = \sum_{j=1}^{n} (x^{(i)}_j - z_{0_j})^2 = \sum_{j=1}^{n} (x^{(i)}_j - x_{0_j})^2 - 2(x^{(i)}_k - x_{0_k})\varepsilon_k + \varepsilon_k^2 .$$

Consider the term $2(x^{(i)}_k - x_{0_k})\varepsilon_k - \varepsilon_k^2$.

By hypothesis $x^{(i)}_k \ge x_{k_{\min}} > x_{0_k}$, and so $(x^{(i)}_k - x_{0_k}) > 0$.

In fact from (2.4.1) and the definition of $x_{k_{\min}}$,
$(x^{(i)}_k - x_{0_k}) \ge \varepsilon_k$, so the term satisfies
$2(x^{(i)}_k - x_{0_k})\varepsilon_k - \varepsilon_k^2 \ge \varepsilon_k^2$; therefore

$$\|x^{(i)} - Z_0\|^2 \le \|x^{(i)} - x_0\|^2 - \varepsilon_k^2 \le r_0^2 - \varepsilon_k^2 .$$

But this implies that an n-sphere centered at $Z_0$ with radius
$\sqrt{r_0^2 - \varepsilon_k^2}$ will be an admissible bounding sphere for S, which is a
contradiction.

Proof of Theorem 2.3:

The notation here is as explained in Lemma 2.4.

Let $y^{(i)}$ be any point of $(k_{\alpha_1} k_{\alpha_2} \cdots)(S)$. Clearly, there exists an n-tuple $J_i$
whose coordinates are picked from $\{1,2,\ldots,m\}$, where $|S| = m$, such that

$$y^{(i)}_k = x^{(j)}_k \quad \text{for } j = (J_i)_k .$$

(All that this says is that all crossover products have coordinates which
are selected from coordinates of the initial set of points.)

$$\|y^{(i)} - x_0\|^2 = \sum_{k=1}^{n} (y^{(i)}_k - x_{0_k})^2 = \sum_{k=1}^{n} (x^{(j)}_k - x_{0_k})^2, \quad j = (J_i)_k .$$

By hypothesis $x_{k_{\min}} \le x^{(j)}_k \le x_{k_{\max}}$, so that

$$x_{k_{\min}} - x_{0_k} \le x^{(j)}_k - x_{0_k} \le x_{k_{\max}} - x_{0_k} .$$

Now,

$$0 \le x_{k_{\max}} - x_{0_k} \le r_0$$

and

$$0 \le x_{0_k} - x_{k_{\min}} \le r_0$$

by Lemma 2.4, so that $|x_{k_{\max}} - x_{0_k}| \le r_0$ and $|x_{k_{\min}} - x_{0_k}| \le r_0$. Denote the set

$$I = \{k \mid |x_{k_{\max}} - x_{0_k}| \ge |x_{k_{\min}} - x_{0_k}|\}.$$

Then from the inequality above:
$|x^{(j)}_k - x_{0_k}| \le |x_{k_{\max}} - x_{0_k}|$ for $k \in I$ and $|x^{(j)}_k - x_{0_k}| \le |x_{k_{\min}} - x_{0_k}|$
for $k \notin I$, so that

$$\sum_{k=1}^{n} (x^{(j)}_k - x_{0_k})^2 \le \sum_{k=1}^{n} r_0^2 = n r_0^2 .$$

That this bound is indeed attainable is seen by choosing $x_{k_{\max}} - x_{0_k} = r_0$
and $x_{k_{\min}} - x_{0_k} = -r_0$ for all k.

In such a case two points of $(k_{\alpha_1} k_{\alpha_2} \cdots)(S)$ will be

$$(x_{0_1} - r_0,\ x_{0_2} - r_0,\ \ldots,\ x_{0_n} - r_0) \quad\text{and}\quad (x_{0_1} + r_0,\ x_{0_2} + r_0,\ \ldots,\ x_{0_n} + r_0)$$

and verification of the claim is straightforward.


Remark 1: In the proof of the theorem use was made of the fact that
$x_{k_{\max}} - x_{0_k} \le r_0$ (and a similar relationship for $x_{k_{\min}}$). This is clear,
because otherwise $x_{k_{\max}} - x_{0_k} > r_0$. Then taking inner products between
any point $x^{(i)}$ containing $x_{k_{\max}}$ as its k-th component and the radius vector
in the k-th axis will yield

$$\langle x^{(i)} - x_0, r \rangle = \langle x^{(i)} - x_0, e_k r_0 \rangle = (x_{k_{\max}} - x_{0_k})\, r_0 > r_0^2$$

so that $r_0$ cannot be the radius of a bounding sphere. Contradiction.


Remark 2: Attainability as above does not mean that starting from any
arbitrary population bounded by $S_0$ the upper bound is attainable. For a
counterexample, consider the case when $x_{k_{\max}} - x_{0_k} = r_0$ except for some
subset of indices.

Remark 3: We proceed to generalize Theorems 2.2 and 2.3, and exhibit in
the process some alternative (and simplified) methods of proof.

Suppose a theorem was true for the case when a M.B.S. was centered
about $x_0$, with radius $r_0$. Then by a translation of axes we may move the
origin to $x_0$. Then it is clear that the theorem is also true for a M.B.S.
centered about 0 with radius $r_0$. The converse is also obvious. We state
this as a lemma (which merely says that translation is an isometry):


Lemma 2.4

It suffices to prove all results with respect to a M.B.S. centered
about the origin.

[Note: rotations are not allowed, since they do not leave the crossover
space invariant.]

Definition: Let $S_0$ be a M.B.S. in $\mathbb{R}^n$ centered about 0, and $r_0$ its radius.
Then a basis set for $S_0$ is $\{r_0 e_1, r_0 e_2, \ldots, r_0 e_n\}$ where $\{e_1, e_2, \ldots, e_n\}$ is
the standard basis for $\mathbb{R}^n$. The reflection of a basis set is
$\{-r_0 e_1, -r_0 e_2, \ldots, -r_0 e_n\}$.

From now on, unless otherwise mentioned, we assume that $S_0$ is centered
about the origin. This does not restrict the validity of the results since
(by the preceding remarks) $S_0$ may be translated to a center at arbitrary $x_0$.

In this vocabulary we may restate the remarks following Theorem 2.3 as

Lemma 2.5

The maximal $\sqrt{n}\, r_0$ bound on $S \subset S_0 \subset \mathbb{R}^n$ is attainable if and only if S
contains a basis set and its reflection.

The "if" part is clear from the example following Theorem 2.3. It 
remains to show necessity, but first the notion of quadrant and some 
preliminary results are discussed. 

Definition: Let $\alpha \in P(N)$ where $N = \{1,2,3,\ldots,n\}$. Then by a quadrant in
$\mathbb{R}^n$ is meant a set of the form $\{(x_1,x_2,\ldots,x_n) \mid x_i > 0 \text{ iff } i \in \alpha\}$, denoted
$Q_\alpha$.

As an example in $\mathbb{R}^3$, the set of all $(x_1,x_2,x_3)$ such that all $x_i$ are
positive constitutes the quadrant $Q_{\{1,2,3\}}$. Clearly, in n-space there
are precisely $2^n$ quadrants.

Definition: Two quadrants $S_1$ and $S_2$ are diametrically opposed if
$S_1 = \{(x_1,x_2,\ldots,x_n) \mid x_i > 0 \text{ iff } i \in \alpha\}$, $S_2 = \{(x_1,x_2,\ldots,x_n) \mid x_i > 0
\text{ iff } i \in \bar\alpha\}$ for some $\alpha$.

Suppose in each quadrant contained in S we consider the norm of each 
point, and then select the minimum and maximum norms. 
A convenient way of looking at crossover is to consider each point
of S expressed as a linear combination of the basis elements, i.e.,

$$x^{(1)} = \sum_{i=1}^{n} x^{(1)}_i e_i, \qquad x^{(2)} = \sum_{i=1}^{n} x^{(2)}_i e_i .$$

Then if $\alpha \in P(\{1,2,\ldots,n\})$, $k_\alpha\{x^{(1)},x^{(2)}\} = \{y^{(1)},y^{(2)}\}$ where

$$y^{(1)} = \sum_{i \in \alpha} x^{(1)}_i e_i + \sum_{i \notin \alpha} x^{(2)}_i e_i$$

$$y^{(2)} = \sum_{i \notin \alpha} x^{(1)}_i e_i + \sum_{i \in \alpha} x^{(2)}_i e_i .$$

Theorem 2.4

The maximal radius of the M.B.S. achievable after m successive generations
is $\min(\sqrt{n}\, r_0, \sqrt{2^m}\, r_0)$.

Proof:

(By induction)

Basis: The bound after one crossover is $\sqrt{2}\, r_0$.

Proof: Let $x^{(1)}$ and $x^{(2)}$ be any two points in S. Then

$$\sum_{i=1}^{n} (x^{(1)}_i)^2 \le r_0^2, \qquad \sum_{i=1}^{n} (x^{(2)}_i)^2 \le r_0^2 .$$

If $\{y^{(1)}, y^{(2)}\} = k_\alpha\{x^{(1)}, x^{(2)}\}$, then

$$\sum_{i=1}^{n} (y^{(j)}_i)^2 \le \sum_{i=1}^{n} (x^{(1)}_i)^2 + \sum_{i=1}^{n} (x^{(2)}_i)^2$$

so that

$$\sum_{i=1}^{n} (y^{(j)}_i)^2 \le 2 r_0^2 .$$

Induction:

Assume the assertion true for m: let $x^{(1)}$ and $x^{(2)}$ be any two
points in $(k_{\alpha_1} k_{\alpha_2} \cdots k_{\alpha_m})(S)$. Then

$$\sum_{i=1}^{n} (x^{(1)}_i)^2 \le 2^m r_0^2, \qquad \sum_{i=1}^{n} (x^{(2)}_i)^2 \le 2^m r_0^2 .$$

By the same argument as above, if $\{y^{(1)}, y^{(2)}\} = k_{\alpha_{m+1}}\{x^{(1)}, x^{(2)}\}$,

$$\sum_{i=1}^{n} (y^{(j)}_i)^2 \le (2^m + 2^m)\, r_0^2 = 2^{m+1} r_0^2 .$$

For $m \ge \log_2 n$, the bound established in Theorem 2.3 clearly holds.


Corollary 2.4 

The bound is attainable if S contains a basis set and its reflection. 
The proof is similar to that following Theorem 2.3. 
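For small n the growth rate in Theorem 2.4 can be observed by exhaustive enumeration. The following is a sketch under stated assumptions: the population starts as a basis set and its reflection with $r_0 = 1$, one "generation" applies every crossover mask to every ordered pair of current points, and the radius reported is the largest distance from the origin (the center of the initial M.B.S.). The function names are introduced here for illustration.

```python
import math
from itertools import product


def next_generation(points):
    """All points obtainable by one crossover between any two current points.

    Iterating over ordered pairs and all masks yields both products of each
    crossover; the all-zero mask keeps the parents in the population.
    """
    points = list(points)
    n = len(points[0])
    children = set()
    for a in points:
        for b in points:
            for mask in product((0, 1), repeat=n):
                children.add(tuple(b[i] if m else a[i] for i, m in enumerate(mask)))
    return children


def max_radius_by_generation(n, generations, r0=1.0):
    """Largest distance from the origin after 0, 1, ..., generations rounds."""
    points = set()
    for k in range(n):
        for value in (r0, -r0):
            points.add(tuple(value if i == k else 0.0 for i in range(n)))
    radii = [max(math.sqrt(sum(c * c for c in p)) for p in points)]
    for _ in range(generations):
        points = next_generation(points)
        radii.append(max(math.sqrt(sum(c * c for c in p)) for p in points))
    return radii


n = 4
for m, radius in enumerate(max_radius_by_generation(n, 3)):
    print(m, radius, min(math.sqrt(n), math.sqrt(2 ** m)))
```

For n = 4 the computed radii follow $\min(\sqrt{n}, \sqrt{2^m})$ with $r_0 = 1$: each generation can at most double the number of non-zero coordinates of a point until all n are filled.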

Remark: The result in Theorem 2.4 is seen in more intuitive terms by
observing that the crossover operators acting on the basis and reflection
set yield upper bounds. Thus, if $x^{(1)}$ and $x^{(2)}$ are any two points in
S, then for all $\{y^{(1)}, y^{(2)}\} = k_\alpha\{x^{(1)}, x^{(2)}\}$, we have that

$$\|y^{(j)}\| \le \sqrt{2}\, r_0 = \|r_0 e_i + r_0 e_k\|$$

for $j = 1, 2$ and any $i \neq k$, and clearly $r_0(e_i + e_k)$ is simply a result of the
action of a crossover operator on $\{r_0 e_i, r_0 e_k\}$. The extension to the general case is clear.

Effectively, then, the proof of Theorem 2.4 reduces to the successive
pairing of elements of $\{e_1, e_2, \ldots, e_n\}$. We now establish the dual
of Theorem 2.4:

Theorem 2.5

The minimal radius of the M.B.S. achievable after m successive generations
is $\max\left(\dfrac{r_0}{\sqrt{n}}, \dfrac{r_0}{\sqrt{2^m}}\right)$.

Proof:

We use Theorem 2.4.

Suppose the initial set $S^{(0)}$ is bounded by an M.B.S. $S_0$ with radius
$r_0$. In m generations suppose the set of crossover operators employed to
achieve the minimal radius is $\{k_{\alpha_1}, k_{\alpha_2}, \ldots, k_{\alpha_m}\}_{\alpha_i \in \sigma}$, where $\sigma$ is an index
set. Let the minimal M.B.S. radius be $r_m$. Now apply the inverse of the
set $\{k_{\alpha_1}, k_{\alpha_2}, \ldots, k_{\alpha_m}\}_{\alpha_i \in \sigma}$ on the points of $S^{(m)}$ such that the original
points of $S^{(0)}$ are generated in exactly the reverse order, generation
by generation.

By Theorem 2.4, the maximal radius for the M.B.S. of $S^{(0)}$ is
$r_0 = \min\{\sqrt{n}\, r_m, \sqrt{2^m}\, r_m\}$. So $\sqrt{n}\, r_m \ge r_0$, or $\sqrt{2^m}\, r_m \ge r_0$, i.e.,

$$r_m \ge \frac{r_0}{\sqrt{n}} \quad\text{or}\quad r_m \ge \frac{r_0}{\sqrt{2^m}}$$

and the result follows.

Remark: The minimal bound is in fact attainable. This may be shown by
considering the set of points $\{\tfrac{r_0}{\sqrt{n}}(\pm e_1, \pm e_2, \ldots, \pm e_n)\} \subset S^{(0)}$ and crossing
these over with the origin $(0,0,\ldots,0)$ for m = 1; then crossing over $S^{(1)}$
with the origin for m = 2; etc. Note that the set of points is simply a
rotated version of the basis and reflection set. As an easy consequence
of Theorems 2.4 and 2.5 we have

Corollary 2.5

(i) If $S^{(0)}$ contains a basis and reflection set, the maximal $\sqrt{n}\, r_0$
bound is attainable in a minimum of $\log_2 n$ generations.

(ii) If $S^{(0)}$ contains $\tfrac{r_0}{\sqrt{n}}\{\pm e_1, \pm e_2, \ldots, \pm e_n\}$, the minimal $\tfrac{r_0}{\sqrt{n}}$ bound
is attainable in a minimum of $\log_2 n$ generations.

Proof:

From Theorem 2.4, on the m-th generation the M.B.S. radius is
$\min(\sqrt{n}\, r_0, \sqrt{2^m}\, r_0)$. Hence the minimum m for which $\sqrt{2^m} \ge \sqrt{n}$ is simply
$m = \log_2 n$. The second case is similar.

As a generalization of Corollary 2.5, we look at the case when the
maximum and minimum of coordinates in $S^{(0)}$ are not necessarily $\pm r_0$,
i.e., some basis and reflection points may be missing. (We consider expansion
theorems only, since contraction theorems are similar.) Further, a
single point may contain more than one minimum or maximum coordinate.

We partition the set $\{1,2,\ldots,n\}$ of subscripts as described in the
flow-diagram:

[Flow diagram not reproduced.]

Thus, we end up with a partition $\{\alpha_1, \alpha_2, \ldots, \alpha_\ell\}$ of $\{1,2,\ldots,n\}$, the
cardinality of which is $\ell$, as indicated.


Corollary 2.5(a)

With a partition obtained as above, the minimum number of generations
required to attain a maximal M.B.S. is $\log_2 \ell$.

Proof:

Without loss of generality we may assume that $\{\alpha_1, \alpha_2, \ldots, \alpha_\ell\}$ are ordered
such that

$$\sum_{i \in \alpha_j} x_{i_{\max}} \ge \sum_{i \in \alpha_k} x_{i_{\max}} \quad\text{if } j > k .$$

Clearly the optimal expansion rate is obtained by crossing over using
the scheme

[Diagram of the crossover pairing scheme, indexed by generation m, not reproduced.]

where $\alpha'_1, \alpha'_2, \alpha'_3, \ldots, \alpha'_\ell$ now represent the points whose $\alpha_1, \alpha_2, \alpha_3, \ldots$ subscripts
are coordinate-maximal or minimal.

The scheme exhausts all of $\{1,2,\ldots,n\}$ when $m = \log_2 \ell$.

Remarks: A similar result holds for contracting M.B.S. It is observed
that it is entirely possible for the maximal or minimal M.B.S. to be achieved
in 0 generations, as indicated by setting $\ell = 1$.

As a consequence of Lemma 2.2 we have an interesting theorem whose
proof is obvious.
Theorem 2.6


Under the crossover operators the mean square distance of points
from the center of the initial M.B.S. is constant.

Corollary 2.6

If $\sigma^2$ is the variance of the distance of points from the center of
the M.B.S., and $\bar{x}$ is the mean distance, then
$\sigma^2 + \bar{x}^2$ is constant.

Remark: The theorem clearly holds for distances from any arbitrary point,
since the proof of Lemma 2.2 was free from any positional restriction.
However our interest is mainly in the result of Theorem 2.6.
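Theorem 2.6 can be illustrated by simulating a population. The sketch below makes assumptions that the report does not specify: points are paired at random, each pair is replaced by its two crossover products under a random mask, and an unpaired point (odd population size) is carried over unchanged. The function names are introduced here.

```python
import random


def mean_square_distance(points, center):
    """Mean over the population of the squared distance to center."""
    total = sum(sum((c - z) ** 2 for c, z in zip(p, center)) for p in points)
    return total / len(points)


def crossover_generation(points, rng):
    """Pair the points at random and replace each pair by its crossover
    products under a random mask (assumed mating scheme)."""
    pts = list(points)
    rng.shuffle(pts)
    out = []
    for k in range(0, len(pts) - 1, 2):
        a, b = pts[k], pts[k + 1]
        mask = [rng.random() < 0.5 for _ in a]
        out.append(tuple(b[i] if m else a[i] for i, m in enumerate(mask)))
        out.append(tuple(a[i] if m else b[i] for i, m in enumerate(mask)))
    if len(pts) % 2:
        out.append(pts[-1])  # unpaired point is carried over unchanged
    return out


rng = random.Random(0)
population = [tuple(rng.uniform(-1, 1) for _ in range(5)) for _ in range(10)]
reference = tuple(rng.uniform(-1, 1) for _ in range(5))  # any fixed point works

for generation in range(4):
    print(generation, mean_square_distance(population, reference))
    population = crossover_generation(population, rng)
```

The printed values agree up to floating-point rounding, because Lemma 2.2 preserves the sum of squared distances for every mated pair separately.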
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SECTION 3 


Crossover - A Markov Chain Model 


Consider the following situation. One is given m points, $\{x_i\}$,
each having n>2 coordinates such that $x_i^j \ne x_{i'}^j$ for all $i \ne i'$ and
$0 < j \le n$. The m points were randomly selected. Random crossover will
occur, replacing the m points with m new points on which crossover will
again take place, etc. Only the $c_j$ operators will be used. One will only
consider a particular point as follows: without loss of generality this
point is $x(0) = x_1$; $x(t+1)$ is the point of $c_j(\{x(t),y\})$ which
has most of the $x_1^j$ occurring in $x(t)$, i.e., $x(t+1)$ is the point which
has the most coordinates in common with $x(t)$. Let $y = (x_{i_1}^1,\ldots,x_{i_n}^n)$,
i.e., a point which may be obtained from the initial m points
by crossover.

Problem: What is the expected time for $x(t) = y$?

This problem may be stated in terms of a Markov Chain as follows:
$x(t)$ is in state $\ell$ if $x(t)$ has exactly $\ell$ coordinates in common with y.
Then if $x(t)$ is in state $\ell$, $x(t+1)$ must be in state $\ell-1$, $\ell$ or $\ell+1$ since
a $c_j$ operator was applied to $x(t)$ and another point to obtain $x(t+1)$.

Let $E_{i,j}$ be the event that $x(t)$ is in state i and $x(t+1)$ is in state j.
Then $P(E_{i,i-1})$ = the probability of choosing a $c_j$ such that $x(t)$ has
coordinate j in common with y, since if $c_j$ is chosen there is no point among
the m points at time t other than x which has that coordinate value. There
are i such $c_j$ operators so $P(E_{i,i-1}) = i/n$. $P(E_{i,i})$ = the probability
of choosing a $c_j$ such that $x(t)$ does not have coordinate j in common with
y and a point which also differs at j from y. There are n-i such $c_j$ operators.
Having chosen such a $c_j$ there is exactly one point which agrees with y at
j. Thus, any of the other m-2 points differs at j from y. Therefore

    \[ P(E_{i,i}) = \left(\frac{n-i}{n}\right)\cdot\left(\frac{m-2}{m-1}\right). \]

$P(E_{i,i+1})$ is the same as $P(E_{i,i})$ except that the one point which agreed
with y at j was chosen, so that

    \[ P(E_{i,i+1}) = \left(\frac{n-i}{n}\right)\cdot\left(\frac{1}{m-1}\right). \]

None of the transition probabilities depend on more than the present state.
Therefore this is a finite Markov Chain. Thus one may use Markov terminology
to derive facts about the system as follows.

All states communicate since if i and j are two states there is a
path from i to j with nonzero probability. Therefore the chain is irreducible.
All states are aperiodic since there exists no r > 1 for state i such that
any path from i to i has length sr for some $s \in N$. By Theorems 1 and 4,
pages 391 and 392 of Feller (1967), all states of the chain have the same
type and this is neither null nor transient; therefore all states are
ergodic. By the theorem on page 393 (same book) the limits
$u_k = \lim_{\ell\to\infty} p_{j,k}^{(\ell)}$ exist and are independent of initial
state j. Also $u_k > 0$, $\sum_k u_k = 1$, $u_j = \sum_i u_i p_{i,j}$ and
$u_k = 1/\mu_k$, where $\mu_k$ is the mean recurrence time of state k,
$p_{i,j} = P(E_{i,j})$ and $p_{i,j}^{(\ell)}$ is the probability of going from
state i to state j along some path of length $\ell$.
Since $p_{i,j}$ is given for each i and j, one can solve for the $u_i$. Let
$U = (u_0\ u_1 \ldots u_n)$ and $P_s = [p_{i,j}]_{(n+1)\times(n+1)}$; then
$u_j = \sum_i u_i p_{i,j} \iff U P_s = U$. The general $P_s$ matrix is in
appendix 3.3. Thus $U(P_s - I) = 0$ and $\sum_i u_i = 1$, so we have to solve

    \[ U\,\bigl[\,P_s - I \;\big|\; \mathbf{1}\,\bigr]_{(n+1)\times(n+2)} = (0,\ldots,0,1)_{1\times(n+2)} \]

where $\mathbf{1}$ is a column of ones.

Therefore the mean recurrence time for i may be found given n and m.

The expected time from state i to state j may be determined as in appendix 3. 
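
As an illustration of the chain just described, the sketch below builds the transition matrix from the three probabilities $P(E_{i,i-1})$, $P(E_{i,i})$, $P(E_{i,i+1})$ and obtains the limiting vector U. It is a sketch under assumptions: the function names are hypothetical, and instead of solving the augmented linear system it uses the detailed-balance shortcut available for any chain that only moves between neighbouring states.

```python
from fractions import Fraction

def transition_matrix(n, m):
    """(n+1) x (n+1) matrix with entries P[i][j] = P(E_{i,j}); requires m >= 2."""
    P = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        if i > 0:
            P[i][i - 1] = Fraction(i, n)                          # lose a matching coordinate
        P[i][i] = Fraction(n - i, n) * Fraction(m - 2, m - 1)     # no change
        if i < n:
            P[i][i + 1] = Fraction(n - i, n) * Fraction(1, m - 1) # gain a matching coordinate
    return P

def stationary(P):
    """Limiting vector U of a neighbour-to-neighbour chain via detailed balance."""
    weights = [Fraction(1)]
    for i in range(len(P) - 1):
        weights.append(weights[-1] * P[i][i + 1] / P[i + 1][i])
    total = sum(weights)
    return [w / total for w in weights]

n, m = 3, 3
P = transition_matrix(n, m)
U = stationary(P)
mean_recurrence = [1 / u for u in U]   # mu_k = 1 / u_k
```

For n = m = 3 this gives U = (8/27, 12/27, 6/27, 1/27), so the mean recurrence time of the goal state is 27 generations. Exact rational arithmetic is used so that the check U P_s = U holds without rounding.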
Let $p_i$ denote the probability of starting in the ith state. Then
$p_i = \binom{n}{i}\left(\frac{1}{m}\right)^i\left(\frac{m-1}{m}\right)^{n-i}$, therefore
$E[i] = \sum_k k\,p_k = n/m$ and

    \[ \sigma^2[i] = n\left(\frac{1}{m}\right)\left(\frac{m-1}{m}\right) = \frac{n(m-1)}{m^2}, \]

    \[ E[\text{time to } n] = \sum_{i=0}^{n-1} p_i\,E[i \text{ to } n]. \]


It is obvious that increasing the number of goal points decreases
the expected time till a goal point is reached by the point which is under
consideration. However, since the probabilities of reaching two different
goal points are not independent, it is not immediately obvious how to
calculate the expected time. It is also obvious that a point not being
considered may reach a goal point before the point under consideration. Thus
a more general problem is to determine the expected time till a point of
the m points reaches a goal point. The probabilities involved in this
problem become extremely complex but may be approximated in the near future.
SECTION 4


Crossover - Special Heuristics 

Having examined the deterministic bounds on crossover in Section 2,
the next logical step is to examine the probabilistic properties of some typical
heuristics employed in implementing crossover. A natural question corresponding
to Theorems 2.4 and 2.5 is the following: If $r_x^{(i)}$* is the distance of point
x from the origin, what are $\mu(r_x^{(i)})$ and $\sigma(r_x^{(i)})$? Clearly, the answer
depends on how the initial ($0^{th}$) generation of points is distributed.

Notation: An uppercase letter, say Z, denotes the random variable Z; a
lower case letter, say z, denotes its value. $F_Z$ and $f_Z$ are the
distribution and density functions of Z respectively. $\mu(r_x^{(i)})$ is
the expectation of $r_x^{(i)}$ over all points x.

4.1 Volume-uniform distribution

In the case of a volume-uniform distribution of points within a
hypersphere, if we assume high dimensionality of the space $R^n$, then by the
"sphere-hardening" property it is a very good approximation to simply
scatter points randomly about the surface of the hypersphere. One way
of doing so is by generating points after the fashion:

Let $\{Y_i\}$, $i = 1,\ldots,n$, be a sequence of independent random variables
with uniform probability density

    \[ f_{Y_i}(\alpha) = \begin{cases} \tfrac{1}{2}, & -1 < \alpha < 1 \\ 0, & \text{elsewhere} \end{cases} \]

and let $X_i = Y_i / \bigl(\sum_{j=1}^{n} Y_j^2\bigr)^{1/2}$;
then $X = (X_1,X_2,\ldots,X_n)$ will be such that $\|X\| = 1$, i.e., lie on the
surface of a hypersphere of unit radius.

*The superscript (i) refers to the $i^{th}$ generation of crossover.

A sequence of such points X, generated by the above process (the
$Y_i$'s may be approximated by some suitable random number generator), will
approximate a high dimensional volume-uniform distribution of points
in a hypersphere of unit radius.

4.2 Coordinate-bounded distribution 

In this case a point X is generated by letting each of its coordinates
be the value of a uniformly distributed random variable $X_i$, bounded in
the interval [-a, a], a > 0. Clearly, this is not a volume-uniform distribution.
However this is the method which was used in the practical implementation
of the genetic algorithms.

4.3 Monte Carlo simulations 

Partial analytical solutions of the questions posed at the beginning 

of this chapter are postponed to the next section. Here we shall present 

results of Monte Carlo simulations as an indication of the kind of answers 
one might expect using the distributions discussed in 4.1 and 4.2. 

The type of crossover heuristic which is conceptually the simplest 
pairs off random points and randomly chooses the segment (i.e., sequence 
of coordinates) that is to be crossed-over. Care has to be taken in the 
program to ensure that the pairing is unique, so that every point is 
crossed-over once and only once every generation. 
In order to observe the effects of (i) initial distribution and
(ii) dimension of the space on the effectiveness of crossover as a 
search operator two simulations were undertaken. The number of points 
used was 100, and unit radius was employed for the initial distribution 
of 4.1, while the initial coordinate-bound of 4.2 was set to [-1,1]. 

At the end of each generation of crossover the maximal, minimal and 
average distance of the 100 points from the origin was computed. The 
standard deviation was also computed. 
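
The experiment described above can be sketched as follows. This is an illustrative reconstruction, not the program used for the report: the function names, the seed, the number of generations and the choice of one contiguous segment per pair are assumptions made for the sketch.

```python
import math
import random
import statistics

def sphere_point(n, rng):
    """Point generated as in 4.1: uniform coordinates scaled to unit length."""
    y = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in y))
    return [v / norm for v in y]

def box_point(n, rng, a=1.0):
    """Point generated as in 4.2: each coordinate uniform on [-a, a]."""
    return [rng.uniform(-a, a) for _ in range(n)]

def crossover_generation(points, rng):
    """Pair every point exactly once and swap one random contiguous segment."""
    if len(points) % 2:
        raise ValueError("an even number of points is needed for unique pairing")
    n = len(points[0])
    order = list(range(len(points)))
    rng.shuffle(order)
    new_points = [None] * len(points)
    for a, b in zip(order[0::2], order[1::2]):
        k = rng.randint(1, n - 1)        # segment length
        start = rng.randint(0, n - k)    # first coordinate of the segment
        xa, xb = list(points[a]), list(points[b])
        xa[start:start + k], xb[start:start + k] = xb[start:start + k], xa[start:start + k]
        new_points[a], new_points[b] = xa, xb
    return new_points

def summarize(points):
    """Maximal, minimal and average distance from the origin, and its std dev."""
    d = [math.sqrt(sum(v * v for v in p)) for p in points]
    return max(d), min(d), statistics.fmean(d), statistics.pstdev(d)

def simulate(make_point, n, generations=10, count=100, seed=0):
    rng = random.Random(seed)
    points = [make_point(n, rng) for _ in range(count)]
    history = [summarize(points)]
    for _ in range(generations):
        points = crossover_generation(points, rng)
        history.append(summarize(points))
    return history

history_sphere = simulate(sphere_point, n=10)
history_box = simulate(box_point, n=10)
```

Each entry of a history is a tuple (maximal, minimal, average, standard deviation) for one generation, with entry 0 describing the initial points. Whatever the pairing, average squared plus standard deviation squared stays fixed from generation to generation, as Theorem 2.6 requires.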

Graphs 4.1 and 4.2 show some typical results. Both cases indicate
that asymptotic values for maximal and minimal distances (which
approximate bounding radii corresponding to Theorems 2.4 and 2.5) are
reached within a few generations. The conclusion is that while
Theorems 2.4 and 2.5 do yield theoretical bounds, with these heuristics 
the bounds are not realistic. In other words, the probability that an 
initial distribution of points will be chosen together with a probable 
succession of crossovers so as to approach these bounds, is very small. 

The search space of successive crossover generations is thus constrained 
to lie approximately between two hyperspheres which is not appreciably 
different from that region demarked by the first few generations. 

An interesting feature of the coordinate-bounded results is that the
standard deviation is almost constant through increasing generations as
well as increasing dimensions. This is not the case in the volume-uniform
distribution where increasing dimension reduces the standard deviation.
Average distances in both cases were remarkably constant in successive
crossovers.
[Graphs 4.1 and 4.2: maximal, minimal and average distance from the origin,
and standard deviation (STD DEV), by generation for each initial
distribution; Dim 10 and 20.]


4.4 Partial result for uniform distribution 

We present an outline of the analysis for the first generation only.
Recall from 4.1 that for two typical points $X^{(1)}$ and $X^{(2)}$ we have:



Suppose a segment of k coordinates was selected randomly and crossed between
$X^{(1)}$ and $X^{(2)}$.


Write:



                                                                        (2)


where $\ell = 2$ for k subscripts and $\ell = 1$ for the remaining n-k subscripts.

There are $\binom{n}{k}$ ways of choosing these subscripts in the case of
generalized crossover, and n-k+1 ways if crossover is restricted to
consecutive coordinates.

Since crossover operates on pairs of points it is desired to find 

    \[ \mu = \text{Expectation}\,[\,w_k (R_k + R_{n-k})/2\,] \qquad (3) \]

    \[ \sigma^2 = \text{Variance}\,[\,w_k (R_k + R_{n-k})/2\,] \qquad (4) \]

where the expectation and variance run over all possible pairs of k, n-k
segment lengths. Depending on whether we restrict crossover to connected 
segments or allow disconnected segments (generalized crossover), the 
relative weights to be attached to each pair will be different. To
fix attention, we choose connected segments, so that $w_k = \frac{1}{n-1}$,
$k = 1,\ldots,n-1$. Then, (3) and (4) reduce to

    \[ \mu = \text{Expectation}\,[(R_k + R_{n-k})/2] \qquad (5) \]

    \[ \sigma^2 = \text{Variance}\,[(R_k + R_{n-k})/2]
              = \{\text{Expectation}\,[((R_k + R_{n-k})/2)^2] - \mu^2\}
              = \{1 - \mu^2\} \qquad (6) \]

since by Theorem 2.6 the mean square distance of points from the origin
is constant. Hence it is sufficient to determine $\mu$. We indicate one
possible development without claiming that it is the simplest or the most
straightforward.

From (5) and the linearity of expectation,

    \[ \mu = \tfrac{1}{2}\,[\text{Expectation}\,(R_k) + \text{Expectation}\,(R_{n-k})] \qquad (7) \]

so that it is sufficient to find the expectation of a typical $R_k$. To this
end we look for a distribution function for $R_k$. Now,

    \[ F_{R_k}(\alpha) = \Pr\{r_k \le \alpha\} = \Pr\{r_k^2 \le \alpha^2\} = F_{R_k^2}(\alpha^2) \]

showing that it is enough to consider $F_{R_k^2}(\beta) = F_{R_k}(\sqrt{\beta})$.


Since 




*i±yi 1)2 +± 

i=l i=k+l 1 i^l 1 i=l 

k /- -i 2 JV -i 2 k n { 91 ^ 

Pr{Zci-3)y, (1) - £ ey k ci) * £ ey^ - £ Ci -e)y} 2) > 


( 8 ) 


(9) 


i=k+l 
it is evident that to evaluate $F_{R_k^2}(\beta)$ it is necessary to determine the
density functions of $S_{k_1}$ and $S_{k_2}$, where these are the random variables
whose values are on the left and right side of the event space in (9).
Clearly, by the way the $y_i^{(1)}$ and $y_i^{(2)}$ are generated, $S_{k_1}$ and $S_{k_2}$
are independent, so that the joint density function

    \[ f_{S_{k_1} S_{k_2}} = f_{S_{k_1}}\, f_{S_{k_2}} \qquad (10) \]

which will be useful when it has to be integrated to yield an expression
for (9).

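The forms of $S_{k_1}$ and $S_{k_2}$ are similar, so that we will consider
a typical

    \[ S = \sum_{i=1}^{k} (1-\beta)\,y_i^2 - \sum_{i=k+1}^{n} \beta\,y_i^2 \qquad (11) \]

First, observe that the mean $\mu_{Y^2}$ and variance $\sigma_{Y^2}^2$ of $y_i^2$ are (from the
uniform density of $Y_i$ in 4.1) given by $\mu_{Y^2} = \tfrac{1}{3}$,
$\sigma_{Y^2}^2 = \tfrac{4}{45}$, as may easily be verified.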

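Next, split S into two random variables $T_k$ and $T_{n-k}$, where

    \[ T_k = \sum_{i=1}^{k} (1-\beta)\,y_i^2 = \sum_{i=1}^{k} \beta' y_i^2 \qquad (12) \]

    \[ T_{n-k} = \sum_{i=k+1}^{n} \beta\,y_i^2 \qquad (13) \]

with $\beta' = 1-\beta$, and then $S = T_k - T_{n-k}$. Now appeal to the Central
Limit Theorem to yield approximate densities for $T_k$ and $T_{n-k}$, namely,

    \[ f_{T_k}(\alpha) = \frac{1}{\sigma_k\sqrt{2\pi}} \exp[-(\alpha-\mu_k)^2/2\sigma_k^2] \qquad (14) \]

    \[ f_{T_{n-k}}(\alpha) = \frac{1}{\sigma_{n-k}\sqrt{2\pi}} \exp[-(\alpha-\mu_{n-k})^2/2\sigma_{n-k}^2] \qquad (15) \]

where $\mu_k = \beta' k/3$, $\mu_{n-k} = \beta(n-k)/3$, $\sigma_k^2 = 4\beta'^2 k/45$ and
$\sigma_{n-k}^2 = 4\beta^2(n-k)/45$. Observe
that $T_k$ and $T_{n-k}$ are independent from the way the $y_i$'s are generated, so
that $S = T_k - T_{n-k}$ has an approximately normal density function given by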

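    \[ f_S(\alpha) = \frac{1}{\sigma_S\sqrt{2\pi}} \exp[-(\alpha-\mu_S)^2/2\sigma_S^2] \qquad (16) \]

where $\sigma_S^2 = \sigma_k^2 + \sigma_{n-k}^2$ and $\mu_S = \mu_k - \mu_{n-k}$. In principle we
have obtained densities $f_{S_{k_1}}(\alpha)$ and $f_{S_{k_2}}(\alpha)$, so that from (9)

    \[ F_{R^2}(\beta) = \iint_A f_{S_{k_1} S_{k_2}}(\eta,\zeta)\,d\eta\,d\zeta \qquad (17) \]

where $A = \{(\eta,\zeta) \mid \eta \le \zeta\}$ and by (10) the integrand may be factored into
$f_{S_{k_1}}(\eta)\cdot f_{S_{k_2}}(\zeta)$. With the obvious notation, (17) may be rewritten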
K 1 K 2 


F r 2(B) = 


v?- /*/' 

12 Jq J Q 

- 4 OT f eicp [H 

b l b 2 JQ 


exp [-(£-y s ) 2 / 2ct s 1 ex P ["Cn-y s ) 2 / 2a s ^ dn C 18 ) 


exp [-(£-y s ) 2 /20g ] erf[(?-y s )//2 0 g ]d? (19) 


where erf is the error function. However, to determine the density of
$R_k^2$ it is necessary to differentiate either (18) or (19) with respect to
$\beta$, recalling that in fact $\mu_{S_1},\mu_{S_2},\sigma_{S_1},\sigma_{S_2}$ are functions of $\beta$.

The analysis was terminated at this point.




4.5 Partial results for coordinate-bounded distribution 


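In this case we have almost exactly the same development as in 4.4
up to equation (8), and we observe that in this heuristic the weights
may be regarded as equal.

    \[ R^2 = \sum_{i=1}^{n} X_i^2 \qquad (20) \]

where

    \[ f_{X_i}(\alpha) = \begin{cases} \dfrac{1}{2a}, & -a \le \alpha \le a \\ 0, & \text{otherwise} \end{cases} \qquad (21) \]

so that

    \[ F_{X_i^2}(\alpha) = \begin{cases} \sqrt{\alpha}/a, & 0 \le \alpha \le a^2 \\ 1, & \alpha > a^2 \end{cases} \qquad (22) \]

and

    \[ f_{X_i^2}(\alpha) = \begin{cases} \dfrac{1}{2a\sqrt{\alpha}}, & 0 < \alpha \le a^2 \\ 0, & \alpha > a^2,\ \alpha < 0 \end{cases} \qquad (23) \]

From here, we may proceed as before. (An alternative route would
be to consider characteristic functions, but the transforms are not easy
to evaluate.) The mean $\mu_{X_i^2}$ and variance $\sigma^2_{X_i^2}$ of (23) are $a^2/3$ and
$4a^4/45$. Then

    \[ f_{R^2}(\alpha) = \frac{1}{\sigma\sqrt{2\pi}} \exp[-(\alpha-\mu)^2/2\sigma^2] \qquad (24) \]

where $\mu = na^2/3$ and $\sigma^2 = 4na^4/45$. The analysis is clearly simpler in
this case than in 4.4.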
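
The mean and variance quoted for $X_i^2$ can be checked directly from the uniform density of $X_i$ on $[-a,a]$. The short derivation below is offered only as a verification sketch; it uses nothing beyond that density and the independence of the coordinates.

```latex
\begin{align*}
E[X_i^2] &= \int_{-a}^{a} \frac{\alpha^2}{2a}\,d\alpha = \frac{a^2}{3}, \\
E[X_i^4] &= \int_{-a}^{a} \frac{\alpha^4}{2a}\,d\alpha = \frac{a^4}{5}, \\
\operatorname{Var}[X_i^2] &= E[X_i^4] - \bigl(E[X_i^2]\bigr)^2
                           = \frac{a^4}{5} - \frac{a^4}{9} = \frac{4a^4}{45}.
\end{align*}
```

Summing n independent coordinates then gives mean $na^2/3$ and variance $4na^4/45$ for $R^2$, which are the parameters of the normal approximation.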
SECTION 5


The Algebraic Structure of Inversion 


5.1 

The genetic process of inversion shown schematically in Figure 5.1
may be interpreted, as pointed out by Zeigler, as a change of basis
transformation. Representing the loci as a basis set $\{f_i\}_{i=1}^{n}$, a string
may be represented as $(X_1,X_2,X_3,\ldots,X_n)$ where $X_i$ is the allele at locus
i. Then an inversion on such a string from locus k through $\ell$ may be
represented as

    \[
    \begin{pmatrix} X_1 \\ \vdots \\ X_{k-1} \\ X_\ell \\ \vdots \\ X_k \\ X_{\ell+1} \\ \vdots \\ X_n \end{pmatrix}
    =
    \begin{pmatrix} I_{k-1} & 0 & 0 \\ 0 & J_{\ell-k+1} & 0 \\ 0 & 0 & I_{n-\ell} \end{pmatrix}
    \begin{pmatrix} X_1 \\ \vdots \\ X_{k-1} \\ X_k \\ \vdots \\ X_\ell \\ X_{\ell+1} \\ \vdots \\ X_n \end{pmatrix}
    \]

where $I_r$ is the r x r identity matrix and $J_r$ is the r x r matrix with ones on
the anti-diagonal and zeros elsewhere, or

    \[ y = TX \]

and it is seen that T is a change of basis transformation, in fact one
that "reverses" the ordering of the subset $\{f_i\}_{i=k}^{\ell}$.

An alternative description is possible. Let the Euclidean space
with ordered basis $(f_1,f_2,\ldots,f_n)$ be denoted VSTR. Then a chromosome
is simply a point in VSTR-space, and an inversion maps VSTR into
VSTR. In fact, if we denote by $\alpha = (i, i+1,\ldots,k)$, the operator
Fig. 5.1 - Inversion on substring 5-6-7-8-9.

Fig. 5.2 - The process of inversion. [Panels: X space, VSTR space.]
$A_\alpha$ which maps a point $(X_1,\ldots,X_n) \in$ VSTR into $(y_1,y_2,\ldots,y_n) \in$ VSTR defined
as follows:

    \[ y_i = X_k, \quad y_{i+1} = X_{k-1}, \quad \ldots, \quad y_k = X_i \]

and $y_j = X_j$ otherwise,

then $A_\alpha$ faithfully represents the action of inversion over segment
(i,...,k) of the coordinates of a point in VSTR.

Examining the action $A_\alpha$: VSTR -> VSTR more closely, we quickly see
that it is isomorphic to a permutation of a special kind, namely, a
product of disjoint transpositions:

    \[ A_\alpha \cong (i,k)\,(i+1,k-1)\,\cdots\,(p,q) \]

where $p = q = (i+k)/2$ if k-i is even,
      $p = (i+k)/2 - \tfrac{1}{2}$, $q = p+1$ if k-i is odd,

so that $\{A_\alpha\}_{\alpha\in\mathcal{A}}$*, the collection of all inversion operators, is isomorphic
to a subgroup of the group of permutations. (That it is actually a subgroup
is clear, if one allows the null inversion to be regarded as an identity).
Generalized inversion, defined like its counterpart generalized crossover
in Section 1, is then seen to be isomorphic to the group of permutations
itself. From this it is clear that $\{A_\alpha\}_{\alpha\in\mathcal{A}}$ is noncommutative, admits a
composition, and has precisely n! generalized operators if dim(VSTR) = n.

We now link this up with Zeigler's interpretation. A point $(X_1,X_2,\ldots,X_n)$
in VSTR after a few inversions $A_{\alpha_1} \circ A_{\alpha_2} \circ \cdots \circ A_{\alpha_k}$ will be
$(X_{i_1},X_{i_2},\ldots,X_{i_n})$ where
$(i_1,i_2,\ldots,i_n) = (1,2,3,\ldots,n)\,A_{\alpha_1} \circ A_{\alpha_2} \circ \cdots \circ A_{\alpha_k}$,
where each $A_{\alpha_j}$ is regarded as a permutation as above. We may represent
the inversion pattern of $(X_{i_1},X_{i_2},\ldots,X_{i_n})$ very vividly as a matrix $\Phi$ in
the fashion:

    \[ \Phi = [\phi_{\ell m}] \]

where $\phi_{\ell m} = 1$ if $i_m = \ell$
                 = 0 otherwise

*$\mathcal{A}$ is the power set of {1,2,...,n}.

So, for example, $(X_4,X_1,X_5,X_3,X_2)$ will have a matrix of


    \[ \Phi = \begin{pmatrix}
    0 & 1 & 0 & 0 & 0 \\
    0 & 0 & 0 & 0 & 1 \\
    0 & 0 & 0 & 1 & 0 \\
    1 & 0 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0 & 0
    \end{pmatrix} \]

By the nature of its construction each row and column of $\Phi$ can have
only one 1. An inversion operator, represented as a T matrix earlier on,
operating on a $\Phi$ matrix by post multiplication will yield a new $\Phi$ matrix
which represents the new inversion pattern of the point. For example,
if $A_{(2,3)}$ is the inversion operator, its T matrix is

    \[ T = \begin{pmatrix}
    1 & 0 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0 & 0 \\
    0 & 1 & 0 & 0 & 0 \\
    0 & 0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 0 & 1
    \end{pmatrix} \]

and

    \[ \Phi T = \begin{pmatrix}
    0 & 0 & 1 & 0 & 0 \\
    0 & 0 & 0 & 0 & 1 \\
    0 & 0 & 0 & 1 & 0 \\
    1 & 0 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 & 0
    \end{pmatrix}
    \quad\leftrightarrow\quad (X_4,X_5,X_1,X_3,X_2) \]

The proof for this algorithm is obvious but very awkward to write out,
and is best left to the reader. Thus, the $\Phi$ matrix is obtainable by successive
post multiplications of the $T_\alpha$ matrices corresponding to each $A_\alpha$,
beginning with the identity matrix corresponding to the natural ordering
of coordinates.


5.2 

In nature, when a chromosome undergoes inversion each locus nevertheless 
is intrinsically identifiable, i.e., the map that interprets the alleles 
associates each of them with the correct functional locus. In the case 
of our model, this is equivalent to saying that when a point in VSTR-space 
is to be evaluated, we must permute its current inversion "state" back to 
the natural ordering of the coordinates. So, in the last example there 
should be a map such that 

    \[ (X_4,X_5,X_1,X_3,X_2) \to (X_1,X_2,X_3,X_4,X_5) \]

In fact we already have a representation for such maps associated with
each point. It is simply the $\Phi$ matrix itself. For example, in the case
discussed when $(X_4,X_5,X_1,X_3,X_2)$ was the current inversion pattern, if we
multiply $\Phi$ and $(X_4,X_5,X_1,X_3,X_2)^T$, i.e.,

    \[
    \begin{pmatrix}
    0 & 0 & 1 & 0 & 0 \\
    0 & 0 & 0 & 0 & 1 \\
    0 & 0 & 0 & 1 & 0 \\
    1 & 0 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 & 0
    \end{pmatrix}
    \begin{pmatrix} X_4 \\ X_5 \\ X_1 \\ X_3 \\ X_2 \end{pmatrix}
    =
    \begin{pmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \\ X_5 \end{pmatrix}
    \]

which recovers the natural ordering of the coordinates. That this is true
in general follows from the easily verified fact that the $\Phi$ matrix is
also isomorphic to the inverse permutation of the inversion pattern which
it represents.
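
The matrix bookkeeping of this section can be reproduced in a few lines. The sketch below is illustrative only: the helper names are hypothetical, patterns are written as tuples of 1-based locus labels, and the starting pattern (4,1,5,3,2) is the one read off the first $\Phi$ matrix of the example.

```python
def pattern_matrix(pattern):
    """Phi for an inversion pattern (i_1, ..., i_n): phi[l][m] = 1 iff i_m = l (1-based)."""
    n = len(pattern)
    return [[1 if pattern[m] == l + 1 else 0 for m in range(n)] for l in range(n)]

def inversion_matrix(n, i, k):
    """T matrix of the simple inversion reversing positions i..k (1-based, inclusive)."""
    T = [[0] * n for _ in range(n)]
    for r in range(1, n + 1):
        c = i + k - r if i <= r <= k else r   # mirrored column inside the segment
        T[r - 1][c - 1] = 1
    return T

def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def apply(M, column):
    """Multiply a permutation matrix by a column of labels."""
    return [column[row.index(1)] for row in M]

phi = pattern_matrix((4, 1, 5, 3, 2))   # inversion pattern before the new inversion
T = inversion_matrix(5, 2, 3)           # invert positions 2 through 3
phi_T = matmul(phi, T)                  # post multiplication gives the new pattern matrix
recovered = apply(phi_T, ["X4", "X5", "X1", "X3", "X2"])
```

Here `phi_T` equals `pattern_matrix((4, 5, 1, 3, 2))`, and `recovered` is the string in its natural order, X1 through X5.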

It is emphasized that cross-over is carried out in VSTR space only
between points which have the same inversion pattern. This is so because
each coordinate represents a locus and only two chromosomes (points)
whose loci (coordinates) are in the same order (inversion pattern) may
have corresponding subsegments (substrings of coordinates) interchanged
in a meaningful fashion.

If a real function f is defined on a Euclidean space X, then $f: X \to \mathbb{R}$.
In order to exploit genetic algorithms each point $x \in X$ is represented
as a list of n coordinates in VSTR-space, which is some permutation of
its natural representation in X-space. Then we may view the inversion
process pictorially as in fig. 5.2. Given $x \in X$, the embedding map takes
it into VSTR (with its natural ordering preserved, of course), so that
y is an isomorphic copy of x. Inversion operators $A_{\alpha_1},A_{\alpha_2},\ldots,A_{\alpha_r}$
operate on y and move it around in VSTR space. As described previously,
these operators are also representable as matrices $T_{\alpha_1},T_{\alpha_2},\ldots,T_{\alpha_r}$. Finally,
when we wish to evaluate f(x), we map the y' point in VSTR back to X
via map $\Phi$, which is the matrix associated with the inversion pattern of
y'. Observe that in the realization of genetic algorithms in the companion
of this report* $\Phi$ is precisely the ISTR vector.

5.3 

The concept of genetic linkage suggests an interesting measure of the 
"inversion distance" between two points in VSTR-space as distinct from 
the Euclidean distance between them. We define the

Inversion distance between y and y' in VSTR space as the minimum
number of simple (non-generalized) inversion operators which must be applied
to the inversion pattern of y in order to yield the inversion pattern of
y'. Denote this by $d_I(y,y')$.

*Bosworth, et al., 1972. (NASA CR-2093).

Clearly, $d_I(y,y') = d_I(y',y)$

and $d_I(y,y') \le d_I(y,y'') + d_I(y'',y')$

for all y'' in VSTR.

Also $d_I(y,y) = 0$ trivially.

So we have that (VSTR, $d_I$) is a metric space.

The importance of this concept is evident, say, when we try to assess 
the effectiveness of genetic-like algorithms with respect to a parameter 
which controls inversion strategies. Suppose we know the optimal inversion 
pattern (in the companion to this report* we cite several examples of functions 
where we do know this) in advance. Then an ordering of algorithms may be 
obtained by examining how quickly the mean inversion distance is decreased 
between the initial points I (X) = Y and that of an optimal point. 

It is easily verified that if dim(VSTR) = n, then $d_I(y,y') \le n-1$
for all points y,y' in VSTR. We have had partial success in looking for
an algorithm which yields $d_I(y,y')$, given y,y', but limitations of time
did not permit us to pursue it to its conclusion; so this is still open.

The main point, however, is that with (VSTR, $d_I$) as a metric space, it
is meaningful to ask questions which have to do with rates of inversion
pattern "convergence".
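
For small n the inversion distance can at least be obtained by exhaustive search. The sketch below is not the algorithm sought in the text: it is a plain breadth-first search over inversion patterns, its cost grows like n!, and the function names are assumptions of the sketch.

```python
from collections import deque
from itertools import permutations

def simple_inversions(pattern):
    """All patterns reachable by reversing one contiguous segment of length >= 2."""
    n = len(pattern)
    for i in range(n - 1):
        for k in range(i + 1, n):
            yield pattern[:i] + pattern[i:k + 1][::-1] + pattern[k + 1:]

def inversion_distance(source, target):
    """Minimum number of simple inversions turning `source` into `target`."""
    source, target = tuple(source), tuple(target)
    if sorted(source) != sorted(target):
        raise ValueError("patterns must be permutations of the same labels")
    seen = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            return seen[current]
        for nxt in simple_inversions(current):
            if nxt not in seen:
                seen[nxt] = seen[current] + 1
                queue.append(nxt)
    # Not reachable in practice: reversals of adjacent pairs generate every permutation.
    raise RuntimeError("target pattern not reached")

natural = (1, 2, 3, 4)
worst = max(inversion_distance(natural, p) for p in permutations(natural))
```

For n = 4 the value `worst` does not exceed n-1 = 3. Averaging `inversion_distance` over a population against a known optimal pattern gives the mean inversion distance discussed above.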

Remarks : It is clear that the above discussion can be more elegantly treated 

as an exercise in group representations, precisely as the subgroup of 
permutation matrices embedded in the general linear group. However it is 
felt that the intuitive approach is more suggestive of the programs developed. 


*Bosworth, et al . , 1972. 



SECTION 6 


Inversion - A Geometric Interpretation 


In Section 0 and again in Section 5 we looked briefly at inversion.
Here we describe one other interpretation. The VSTR-space of Section 5
may be regarded as the Cartesian product of a Euclidean space Y and a group
of permutations T. One may visualize inversion as carrying the space Y
through "permuted" copies of itself, each copy being labelled by an element
of T. The crucial observation is that if we project $Y \times T \to Y$, and examine
the effect of inversion by observing the effect on projected points in Y,
some interesting properties are revealed. Fig. 6.1 may help to clarify
the above remarks.

The results of this section are concerned solely with inversion as
observed on the space Y. Referring to fig. 6.1, x' is an "inverted" image
of x, and we project x' to x'' in $E^2 \times (1,2)$ - it is clear that it does not
matter in which "layer" of $Y \times T$ we choose to work.
Definition 6.1

Let $x = (x_1,x_2,\ldots,x_n)$ be a point in Y. Then the inversion orbit,
denoted O(x), is the set of points resulting from some inversion of x,
i.e., permutations of the $x_i$.

Definition 6.2

A plane polytope is one which lies entirely in a hyperplane.


Theorem 6.1

Let $x = (x_1,x_2,\ldots,x_n) \in Y$. O(x) forms a plane polytope with
centroid $\beta(1,1,\ldots,1)$ where $\beta = \frac{1}{n}\sum_{i=1}^{n} x_i$ and the plane of the
polytope is orthogonal to the radius vector (1,1,...,1). The vertices of this
polytope lie on a hypersphere with the centroid as center.


Proof:

Let $O(x) = \{p_1,p_2,\ldots,p_m\}$. In the coordinate representation of x,
if a coordinate value is repeated, denote the number of times it is repeated
by r. Then

    \[ m = \frac{n!}{r_1!\,r_2!\cdots r_k!} \]

if k coordinate values were repeated $r_1,r_2,\ldots,r_k$ times respectively.
m = n! if and only if no coordinate values are repeated.

The $i^{th}$ coordinate $\gamma_i$ of the centroid is given by

    \[ \gamma_i = \frac{1}{m}\sum_{j=1}^{m} (p_j)_i \]

where $(p_j)_i$ is the $i^{th}$ coordinate of $p_j$. In O(x), each $x_\ell$ appears in a
given coordinate i exactly $m r_\ell / n$ times (with $r_\ell = 1$ for a value that is
not repeated). Hence, summing over the distinct coordinate values,

    \[ \gamma_i = \frac{1}{m}\sum_{\ell} \frac{m r_\ell}{n}\,x_\ell = \frac{1}{n}\sum_{i=1}^{n} x_i = \beta \]

For any $p_j$, the coordinates of $p_j$ are some permutation of
$(x_1,x_2,\ldots,x_n)$. So $(p_j)_i - \beta = x_{k_i} - \frac{1}{n}\sum_{i=1}^{n} x_i$ for some $k_i$.

The vector $q_j$ joining $p_j$ to the centroid has $i^{th}$ coordinate $(p_j)_i - \beta$.
It is sufficient to show that $\langle q_j, (1,1,\ldots,1)\rangle = 0$ for all j, or equivalently,
that $\sum_{i=1}^{n} [(p_j)_i - \beta] = 0$. But from the above this sum reduces to

    \[ \sum_{k_i=1}^{n} x_{k_i} - n\cdot\frac{1}{n}\sum_{i=1}^{n} x_i = 0. \]

So O(x) does form a plane polytope orthogonal to (1,1,1,...,1), with
centroid $\beta(1,1,\ldots,1)$.

The points of O(x) are equidistant from the centroid, since by the
generalized Pythagoras Theorem, supposing $p \in O(x)$,

    \[ \|p\|^2 = \sum_{i=1}^{n} x_i^2, \text{ a constant for any } p \in O(x) \]

and

    \[ \|\beta(1,1,\ldots,1)\|^2 = n\beta^2, \text{ a constant for } O(x), \]

so that $r_0 = \|p - \beta(1,1,\ldots,1)\|$ is a constant. The conclusion is that
the plane polytope formed by O(x) is circumscribed by a hypersphere of radius
$r_0$.
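
The statements of Theorem 6.1 are easy to confirm by enumeration for a small point. The sketch below is illustrative: the function names and the sample point, which deliberately repeats one coordinate value, are assumptions.

```python
import math
from itertools import permutations

def inversion_orbit(x):
    """Distinct permutations of the coordinates of x."""
    return sorted(set(permutations(x)))

def centroid(points):
    m = len(points)
    return [sum(p[i] for p in points) / m for i in range(len(points[0]))]

x = (1.0, 2.0, 2.0, 5.0)            # the value 2.0 is repeated twice
orbit = inversion_orbit(x)          # m = 4!/2! = 12 points
beta = sum(x) / len(x)              # 2.5
center = centroid(orbit)

# Distance of every orbit point from the centroid.
radii = [math.sqrt(sum((p_i - c_i) ** 2 for p_i, c_i in zip(p, center))) for p in orbit]

# Inner product of (p - centroid) with (1, 1, ..., 1) for every orbit point.
dot_with_ones = [sum(p_i - beta for p_i in p) for p in orbit]
```

For this point every entry of `center` equals 2.5, every entry of `radii` equals 3 (since 34 - 4(2.5)^2 = 9), and every entry of `dot_with_ones` vanishes.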


Corollary 6.1

The inversion orbits $\{O(x)\}_{x \in Y}$ partition Y.

Corollary 6.2

    \[ \operatorname{diam} O(x) = \max_{p_1,p_2 \in O(x)} \|p_1 - p_2\| \le 2\,\|p - \beta(1,1,\ldots,1)\| \]

Corollary 6.3

All points in an inversion orbit yield the same function value.
SECTION 7


Metrics on Sets of Strategies 

In dealing with genetic algorithms to optimize functions we note that 
there are at least countably many heuristics. Each of these heuristics may 
be said to be a strategy. The question naturally arises as to how we are 
to compare strategies. There is an (indefinite) intuitive notion, for 
instance, of the "nearness" of two strategies. 

In this section we propose one measure to compare strategies, and 
couch its development in a game-theoretic vocabulary so as to give it 
an interpretation. 

Notation: Let

$\mathcal{F}$ be the set of all strategies;

$\mathcal{B}'$ be the set of all game configurations enumerated from all
possible game trees, possibly with repetitions;

$\mathcal{B}$ be the set of all distinct game configurations.

Remarks: We assume that both $\mathcal{F}$ and $\mathcal{B}$ can be effectively enumerated.
Clearly, $\mathcal{B} \subseteq \mathcal{B}'$ and a given element b may be enumerated several
times over in $\mathcal{B}'$. For a finite game $|\mathcal{B}'| < \infty$. Each element of $\mathcal{F}$,
say $f \in \mathcal{F}$, maps $\mathcal{B} \to \mathcal{B}'$.

Intuitively we would want two functions to be identically equal
iff they map any b' to the same next game configuration: i.e., two
strategies are said to be equal if

    \[ f_1(b') = f_2(b') \quad \forall b' \]

To get at the notion of the "nearness" of two strategies, it is
possible to define a function $d: \mathcal{F} \times \mathcal{F} \to \mathbb{R}$ as follows:

    \[ d(f_1,f_2) = \Pr\{b \mid f_1(b) \ne f_2(b)\} \]

where Pr is a probability measure on $\mathcal{B}'$. This is intuitively satisfactory
on several counts. For game configurations which appear "often", their
enumeration in $\mathcal{B}'$ is repeated, so that they contribute proportionately
to the measure. In games with large numbers of configurations the above
definition can serve as the basis of a "Monte Carlo" type estimate of
the (difference) distance between two strategies.

Theorem 7 . 1 

d as defined above is a metric. 

Proof:

(i) $d(f_1,f_1) = \Pr\{b \mid f_1(b) \ne f_1(b)\} = \Pr(\emptyset) = 0$

(ii) $d(f_1,f_2) = d(f_2,f_1) \ge 0$ is obvious.

(iii) we have to prove the triangle inequality
$d(f_1,f_2) \le d(f_1,f_3) + d(f_3,f_2)$.

For any $f_3 \in \mathcal{F}$

    \[ \{b \mid f_1(b) = f_2(b)\} \supseteq \{b \mid f_1(b) = f_2(b) = f_3(b)\}
       = \{b \mid f_1(b) = f_3(b)\} \cap \{b \mid f_2(b) = f_3(b)\} \]

Take complements:

    \[ \{b \mid f_1(b) \ne f_2(b)\} \subseteq \{b \mid f_1(b) \ne f_3(b)\} \cup \{b \mid f_2(b) \ne f_3(b)\} \]

which implies

    \[ \Pr\{b \mid f_1(b) \ne f_2(b)\} \le \Pr\{b \mid f_1(b) \ne f_3(b)\} + \Pr\{b \mid f_2(b) \ne f_3(b)\} \]

or $d(f_1,f_2) \le d(f_1,f_3) + d(f_3,f_2)$.

Remarks: In a finite game, $m < \infty$, so that the probability measure
Pr is simply a counting measure after this fashion: if n is the number
of occurrences of $b \in \mathcal{B}'$ which satisfy Pred(b), then

    \[ \Pr\{b \mid \text{Pred}(b)\} = \frac{n}{m} \]
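
The counting-measure form of d, and the "Monte Carlo" type estimate mentioned earlier, can be sketched as follows. Everything concrete here is an assumption made for illustration: the toy configurations are integers, the three strategies are arbitrary rules mapping a configuration to a next configuration, and the function names are hypothetical.

```python
import random
from fractions import Fraction

def strategy_distance(f1, f2, configurations):
    """Fraction of enumerated configurations (repeats included) where f1 and f2 differ."""
    if not configurations:
        raise ValueError("need at least one configuration")
    disagreements = sum(1 for b in configurations if f1(b) != f2(b))
    return Fraction(disagreements, len(configurations))

def sampled_distance(f1, f2, configurations, samples, rng):
    """Monte Carlo estimate: draw configurations at random instead of enumerating all."""
    draws = [rng.choice(configurations) for _ in range(samples)]
    return strategy_distance(f1, f2, draws)

def f1(b):
    return b + 1

def f2(b):
    return b + 1 if b % 2 == 0 else b + 2

def f3(b):
    return b + 2

# Enumeration with repetitions: frequently met configurations appear several times.
configurations = [0, 1, 1, 2, 3, 3, 3, 4, 5, 6]

d12 = strategy_distance(f1, f2, configurations)
d13 = strategy_distance(f1, f3, configurations)
d23 = strategy_distance(f2, f3, configurations)
estimate = sampled_distance(f1, f2, configurations, samples=1000, rng=random.Random(0))
```

With these toy strategies d12 = 3/5, d13 = 1 and d23 = 2/5, so the triangle inequality holds with equality in one case. The sampled estimate approaches d12 as the number of samples grows.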


Some unsatisfactory points are now observed. It is not entirely
clear how one could modify the definition of the metric (by using weighted
metrics) to account for the fact that some game configurations are "more
critical" than others. Even the same configurations appearing at different
levels of game trees may have to be differently weighted. Again, suppose
$d(f_1,f_2) = d(f_1,f_3)$, and let

    \[ \delta_2 = \{b \mid f_1(b) \ne f_2(b)\} \]

    \[ \delta_3 = \{b \mid f_1(b) \ne f_3(b)\} \]

then even though $|\delta_2| = |\delta_3|$ it may be that the set $\delta_2$ has most of its
elements appearing early in the game trees, while $\delta_3$ has most of its
elements appearing late.

These second order effects are not yet considered.

Corollary 7.1 ($\mathcal{F}$, d) is a metric space.

Adaptive Plans and Optimal Strategies 

Suppose there is a strategy which is optimal in the sense that it
assures a win for any tree. A good adaptive plan is one which, despite
false starts, eventually picks on such an optimal strategy.

To formalize this, an adaptive plan P is a function which maps strategies
into strategies, i.e., $P: \mathcal{F} \to \mathcal{F}$, and the set of all adaptive plans is
denoted by $\mathcal{P}$.

However, in most implementations of adaptive plans, there is involved
a payoff or penalty function. We choose to use a penalty function.

For a fixed $p \in \mathcal{P}$ let $f_0$ be the original choice of a strategy. If
$\mu_0$ is the initial penalty then the aim of p will be to reduce $\{\mu_n\}$ to
zero as quickly as possible by judicious choices of $\{f_n\}$. So more accurately,

    \[ p: \mathcal{F} \times U \to \mathcal{F} \times U \]

where U is the set of penalty functions associated with strategies.

Obvious choices of penalty functions are (i) monotonic functions
of metrics (ii) cumulative density functions of metrics.
APPENDIX 2.1


A simple numerical example to illustrate the spherical bounds on
crossover:

Let $x^{(1)} = (1,2,3,4)$
    $x^{(2)} = (3,2,1,0)$

Suppose we crossover coordinates 1 and 4.
    $y^{(1)} = (3,2,3,0)$
    $y^{(2)} = (1,2,1,4)$

Now the mid-point of $x^{(1)}$ and $x^{(2)}$ is $(2,2,2,2) = x_0$.

    \[ \|x^{(1)} - x_0\|^2 = 1^2 + 0^2 + 1^2 + 2^2 = \|x^{(2)} - x_0\|^2 \]

    \[ \|y^{(1)} - x_0\|^2 = 1^2 + 0^2 + 1^2 + 2^2 = \|y^{(2)} - x_0\|^2 \]

Also, note that $y^{(1)} - x_0 = (1,0,1,-2)$ and $y^{(2)} - x_0 = (-1,0,-1,2) = -(y^{(1)} - x_0)$.
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APPENDIX 2.2 


In the definition of a minimal bounding sphere (M.B.S.), it was 
stated that time did not permit refined proofs and arguments on details. 
However, it was asserted that at least one bounding sphere exists. This 
appendix presents a method for finding one such sphere. 

Let S be a bounded set of points $\{x^{(1)}, \ldots, x^{(m)}\}$ (as usual we assume 
S is finite). 

Let 

$$x_k^{\max} = \max_i \{ x_k^{(i)} \mid x^{(i)} \in S \}, \qquad x_k^{\min} = \min_i \{ x_k^{(i)} \mid x^{(i)} \in S \}.$$

Let 

$$x_{0_k} = \tfrac{1}{2}\left( x_k^{\max} + x_k^{\min} \right) \quad \text{for all } k,$$

and $x_0 = (x_{0_1}, x_{0_2}, \ldots, x_{0_n})$. Then 

$$|x_k^{(i)} - x_{0_k}| \le \tfrac{1}{2}\left( x_k^{\max} - x_k^{\min} \right)$$

and 

$$r_k = \max_i |x_k^{(i)} - x_{0_k}| = \tfrac{1}{2}\left( x_k^{\max} - x_k^{\min} \right).$$

Then define a radius $r = \left( \sum_{k=1}^{n} r_k^2 \right)^{1/2}$. These define a bounding sphere, 
since $|x^{(i)} - x_0|^2 = \sum_k |x_k^{(i)} - x_{0_k}|^2 \le \sum_k r_k^2 = r^2$ for every $x^{(i)}$ in S. 
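
The construction can be sketched in a few lines of Python. This is only an illustration under our own naming (`bounding_sphere` is a hypothetical helper, not part of the report), and the sample points are the four strings of Appendix 2.1.

```python
import math


def bounding_sphere(points):
    """Centre and radius of a (not necessarily minimal) bounding sphere.

    The centre is the coordinate-wise mid-range of the points; the radius
    is the root of the summed squares of the per-coordinate deviations r_k.
    """
    n = len(points[0])
    hi = [max(p[k] for p in points) for k in range(n)]
    lo = [min(p[k] for p in points) for k in range(n)]
    centre = [(h + l) / 2 for h, l in zip(hi, lo)]
    r_k = [max(abs(p[k] - centre[k]) for p in points) for k in range(n)]
    radius = math.sqrt(sum(r * r for r in r_k))
    return centre, radius


S = [(1, 2, 3, 4), (3, 2, 1, 0), (3, 2, 3, 0), (1, 2, 1, 4)]
centre, radius = bounding_sphere(S)

# Every point of S lies inside or on the sphere.
inside = all(math.dist(p, centre) <= radius + 1e-12 for p in S)
```

The sphere obtained this way is generally larger than the minimal one: it encloses the whole coordinate box spanned by S, not just the points themselves.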
APPENDIX 3 


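3.1 Given $u_0, \ldots, u_n$, the expectation of the time to go from state i to state 
i is $E[i \text{ to } i] = \mu_i = 1/u_i$. The transition probability $p_{i,j}$ from state 
i to state j is known. 

$p_{n,n-1} = 1$, therefore $E[n \text{ to } n] = E[n-1 \text{ to } n] + 1 \Rightarrow E[n-1 \text{ to } n] = E[n \text{ to } n] - 1$. 

$$E[n-1 \text{ to } n-1] = p_{n-1,n-2}\,(E[n-2 \text{ to } n-1] + 1) + p_{n-1,n-1} + 2\,p_{n-1,n} \;\Rightarrow$$

$$E[n-2 \text{ to } n-1] = \frac{E[n-1 \text{ to } n-1] - p_{n-1,n-2} - p_{n-1,n-1} - 2\,p_{n-1,n}}{p_{n-1,n-2}}.$$

These values suggest an algorithm. Let 0 < i < n-1. 

$$E[i \text{ to } i] = p_{i,i-1}\,(E[i-1 \text{ to } i] + 1) + p_{i,i} + p_{i,i+1}\,(E[i+1 \text{ to } i] + 1)$$

where the $p_{i,j}$'s and $E[i \text{ to } i]$ are given and 

$$E[i+1 \text{ to } i] = p_{i+1,i} + p_{i+1,i+1}\,(E[i+1 \text{ to } i] + 1) + p_{i+1,i+2}\,(E[i+2 \text{ to } i+1] + E[i+1 \text{ to } i] + 1)$$

where it is assumed that $E[i+2 \text{ to } i+1]$ has been previously calculated, 
starting from $E[n \text{ to } n-1] = 1$. The final term counts the one step taken 
from state i+1 to state i+2 before the return through i+1 to i. 

Thus the expectations $E[i-1 \text{ to } i]$ may be calculated for each i such that 
0 < i ≤ n. 

Given $E[\ell \text{ to } j]$ for both $\ell = i$ and $\ell = i+1$, where 0 < i < j-1, 

$$E[i \text{ to } j] = p_{i,i-1}\,(E[i-1 \text{ to } j] + 1) + p_{i,i}\,(E[i \text{ to } j] + 1) + p_{i,i+1}\,(E[i+1 \text{ to } j] + 1)$$

and 

$$E[i \text{ to } i+1] = p_{i,i-1}\,(E[i-1 \text{ to } i+1] + 1) + p_{i,i}\,(E[i \text{ to } i+1] + 1) + p_{i,i+1}$$

for each 0 < i < n, each of which can be solved for the one unknown with first 
index i-1. Therefore $E[i \text{ to } j]$ is determined for each i and j 
such that 0 ≤ i < j ≤ n. 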
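
The recursion above can be sketched in Python with exact rational arithmetic. This is a sketch under stated assumptions rather than the original program: the function name `expected_times` is ours; the chain is assumed tridiagonal with $p_{n,n-1} = 1$; the equations are used in solved-for-the-unknown form; the downward recurrence counts the step into state i+2; and the matrix `P` and vector `u` are our reading of the numerical example of Appendix 3.2.

```python
from fractions import Fraction as F


def expected_times(P, u):
    """E[(i, j)] = expected number of steps to first reach j from i.

    Assumes a tridiagonal transition matrix P over states 0..n with
    P[n][n-1] == 1, and stationary probabilities u (so E[i to i] = 1/u[i]).
    """
    n = len(P) - 1

    def p(i, j):
        return P[i][j] if 0 <= j <= n else F(0)

    E = {}
    # Downward one-step times E[i+1 to i], starting from E[n to n-1] = 1.
    prev = F(0)  # stands for E[i+2 to i+1]; multiplied by 0 when i+1 == n
    for i in range(n - 1, -1, -1):
        E[(i + 1, i)] = (1 + p(i + 1, i + 2) * prev) / p(i + 1, i)
        prev = E[(i + 1, i)]
    # Upward one-step times E[i-1 to i], from the return-time equation.
    for i in range(1, n + 1):
        down = E[(i + 1, i)] if i < n else F(0)
        E[(i - 1, i)] = (1 / u[i] - 1 - p(i, i + 1) * down) / p(i, i - 1)
    # Longer upward times E[i-1 to j], solving the equation for E[i to j].
    for j in range(2, n + 1):
        h = {j: F(0), j - 1: E[(j - 1, j)]}
        for i in range(j - 1, 0, -1):
            h[i - 1] = (h[i] * (1 - p(i, i)) - 1 - p(i, i + 1) * h[i + 1]) / p(i, i - 1)
            E[(i - 1, j)] = h[i - 1]
    return E


# Assumed chain of the numerical example (n = 3).
P = [
    [F(1, 2), F(1, 2), F(0), F(0)],
    [F(1, 3), F(1, 3), F(1, 3), F(0)],
    [F(0), F(2, 3), F(1, 6), F(1, 6)],
    [F(0), F(0), F(1), F(0)],
]
u = [F(8, 27), F(12, 27), F(6, 27), F(1, 27)]
E = expected_times(P, u)
```

Exact fractions are used so that the results can be compared digit for digit with a hand calculation.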
A NUMERICAL EXAMPLE 


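3.2 Let n = m = 3; then 

$$P = \begin{pmatrix} \tfrac12 & \tfrac12 & 0 & 0 \\ \tfrac13 & \tfrac13 & \tfrac13 & 0 \\ 0 & \tfrac23 & \tfrac16 & \tfrac16 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$

so that $u(P - I) = 0 \Rightarrow$ 

1) $-\tfrac12 u_0 + \tfrac13 u_1 = 0$ 

2) $\tfrac12 u_0 - \tfrac23 u_1 + \tfrac23 u_2 = 0$ 

3) $\tfrac13 u_1 - \tfrac56 u_2 + u_3 = 0$ 

4) $\tfrac16 u_2 - u_3 = 0$ 

5) $u_0 + u_1 + u_2 + u_3 = 1$ 

1) $\Rightarrow u_1 = \tfrac32 u_0$ 

3) + 4) $\Rightarrow u_2 = \tfrac12 u_1$ 
and 4) $\Rightarrow u_3 = \tfrac16 u_2$ 

therefore $u_2 = \tfrac12 \cdot \tfrac32 u_0 = \tfrac34 u_0$ and $u_3 = \tfrac16 \cdot \tfrac34 u_0 = \tfrac18 u_0$. 

5) $\Rightarrow u_0 + \tfrac32 u_0 + \tfrac34 u_0 + \tfrac18 u_0 = 1 \Rightarrow u_0 = \tfrac{8}{27}$, 

so that $u_1 = \tfrac{12}{27}$, $u_2 = \tfrac{6}{27}$ and $u_3 = \tfrac{1}{27}$; therefore $\mu_0 = \tfrac{27}{8}$, $\mu_1 = \tfrac94$, $\mu_2 = \tfrac92$ and $\mu_3 = 27$. 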
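
The elimination above can be replayed with exact fractions. The sketch below is only a check: the transition matrix `P` is our reading of the example (an assumption where the typeset matrix is hard to read), and the variable names are ours.

```python
from fractions import Fraction as F

# Assumed transition matrix of the example (each row sums to 1).
P = [
    [F(1, 2), F(1, 2), F(0), F(0)],
    [F(1, 3), F(1, 3), F(1, 3), F(0)],
    [F(0), F(2, 3), F(1, 6), F(1, 6)],
    [F(0), F(0), F(1), F(0)],
]

# Express u_1, u_2, u_3 as multiples of u_0, as in equations 1), 3)+4) and 4).
ratios = [F(1)]
ratios.append(F(3, 2) * ratios[0])   # u_1 = 3/2 u_0
ratios.append(F(1, 2) * ratios[1])   # u_2 = 1/2 u_1
ratios.append(F(1, 6) * ratios[2])   # u_3 = 1/6 u_2

# Equation 5) fixes the normalisation.
u0 = 1 / sum(ratios)
u = [r * u0 for r in ratios]

# Stationarity check: u P should reproduce u.
uP = [sum(u[i] * P[i][j] for i in range(4)) for j in range(4)]

# Mean recurrence times mu_i = 1 / u_i.
mu = [1 / x for x in u]
```

Comparing `uP` with `u` confirms that the solved vector is stationary for the assumed matrix.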


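Using the algorithm described in Appendix 3.1, one obtains the following: 

$E[2 \text{ to } 3] = 27 - 1 = 26$ 

$E[2 \text{ to } 2] = p_{2,1}\,(E[1 \text{ to } 2] + 1) + p_{2,2} + 2\,p_{2,3}$ 
$\Rightarrow E[1 \text{ to } 2] = \tfrac32\left(\tfrac92 - \tfrac16 - \tfrac13\right) - 1 = 5$ 

$E[2 \text{ to } 1] = \tfrac23 + \tfrac16\,(E[2 \text{ to } 1] + 1) + \tfrac16\,(E[3 \text{ to } 2] + E[2 \text{ to } 1] + 1)$, with $E[3 \text{ to } 2] = 1$ 
$\Rightarrow E[2 \text{ to } 1]\left(1 - \tfrac16 - \tfrac16\right) = \tfrac23 + \tfrac16 + \tfrac13 = \tfrac76 \Rightarrow E[2 \text{ to } 1] = \tfrac32 \cdot \tfrac76 = \tfrac74$ 

$E[1 \text{ to } 1] = \tfrac13\,(E[0 \text{ to } 1] + 1) + \tfrac13 + \tfrac13\,(E[2 \text{ to } 1] + 1)$ 
$\Rightarrow E[0 \text{ to } 1] = 3\left(\tfrac94 - \tfrac13 - \tfrac13 - \tfrac13 \cdot \tfrac{11}{4}\right) = 3 \cdot \tfrac23 = 2$ 

$E[2 \text{ to } 3] = \tfrac23\,(E[1 \text{ to } 3] + 1) + \tfrac16\,(E[2 \text{ to } 3] + 1) + \tfrac16 = 26$ 
$\Rightarrow E[1 \text{ to } 3] = \tfrac32\left(26 - \tfrac23 - \tfrac{27}{6} - \tfrac16\right) = 31$ 

$E[1 \text{ to } 3] = \tfrac13\,(E[0 \text{ to } 3] + 1) + \tfrac13\,(E[1 \text{ to } 3] + 1) + \tfrac13\,(E[2 \text{ to } 3] + 1) = 31$ 
$\Rightarrow E[0 \text{ to } 3] = 3\left(31 - \tfrac13 - \tfrac{32}{3} - 9\right) = 33.$ 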
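
As an independent check on the times to reach state 3, the values 26, 31 and 33 can be substituted into the first-step equations $h_i = 1 + \sum_j p_{i,j} h_j$ (with $h_3 = 0$). The sketch below assumes our reading of the transition matrix of this example; the names `P`, `h` and `residuals` are ours.

```python
from fractions import Fraction as F

# Assumed transition matrix of the example in Appendix 3.2.
P = [
    [F(1, 2), F(1, 2), F(0), F(0)],
    [F(1, 3), F(1, 3), F(1, 3), F(0)],
    [F(0), F(2, 3), F(1, 6), F(1, 6)],
    [F(0), F(0), F(1), F(0)],
]

# h[i] = E[i to 3]; the target state itself has hitting time 0.
h = [F(33), F(31), F(26), F(0)]

# One step is always taken, after which the chain sits in state j
# with probability P[i][j] and still needs h[j] further steps.
residuals = [h[i] - (1 + sum(P[i][j] * h[j] for j in range(4))) for i in range(3)]

# The return time to state 3 follows from one forced step to state 2.
return_time_3 = 1 + h[2]
```

All three residuals vanish, and `return_time_3` agrees with the mean recurrence time of state 3 used at the start of the calculation.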